The aim of the conference was to hold a working discussion on
problems essential to the building of thesauri, a uniform
comprehension of which would make it possible to build them in a
manner ensuring easier communication and exchange of information
between information systems covering different ranges of subject
matter, and using different languages. The conference discussed the
way essential terms such as "thesaurus," "descriptor," and
"ascriptor" should be defined. The fundamental elements of which
thesauri should consist were also discussed and special stress was
laid on methods of thesauri building, selecting and qualifying
descriptors, their structure, and interrelations and determination.
The conference discussed the building of monolingual thesauri and
some problems of polylingual thesauri and debated the Unesco
publication, "Guidelines for the Establishment and Development of
Monolingual Scientific and Technical Thesauri for Information
Retrieval: third draft." (Author/AB)
The International Conference on General Principles of Thesauri
Building, organised in Warsaw from 23rd-2?th March 1970 by the
Documentation and Scientific Information Centre of the Polish
Academy of Sciences, was attended by 5? participants from 13
countries. The aim of the conference was to hold a working
discussion on problems essential to the building of thesauri, a
uniform comprehension of which would make it possible to build them
in a manner ensuring easier communication and exchange of
information between information systems covering different ranges
of subject matter, and using different languages.

This being so, the agenda provided only for a brief presentation by
the participants of their particular views on issues which were in
principle set out in the Questionnaire, and most of the time was
spent in discussing these matters. But other questions could be and
were raised.

With a view to clarifying the ideas and meanings ascribed to
particular terms, the conference discussed primarily the way these
terms are to be understood. The terms included the essential ones
such as "thesaurus", "descriptor" and "ascriptor"
/"non-descriptor"/, and a great measure of unanimity was shown. The
fundamental elements of which thesauri should consist were also
discussed. Special stress was laid on methods of thesauri building,
selecting and qualifying descriptors, their structure,
interrelations and determination.

The conference discussed the building of monolingual thesauri and
some problems of polylingual thesauri, debated the UNESCO
publication "Guidelines for the Establishment and Development of
Monolingual Scientific and Technical Thesauri for Information
Retrieval: third draft", and tabled a number of amendments and
remarks. Varying points of view were expressed on thesauri, their
structure, methods of building and even their very essence. As the
causes of the differences were exposed, there were chances to
resolve them. Partial agreement on some basic formulations was also
reached.

In publishing the preliminary material on the conference, its
course and results, I would like to emphasise that these were only
made possible by the dedication, effort and goodwill of the
participants. I hereby thank them all. I would also like to express
my deep gratitude to Professor Janusz Groszkowski, President of the
Polish Academy of Sciences, for his active support and for his
opening address. And I must also praise the important contribution
to proceedings made by the conference secretary, Mrs. Barbara
Krygier.
1. The role of thesauri
2. Terminological problems
   a. The definition of a thesaurus
   b. The definition of a descriptor
   c. The definition and name of forbidden terms
   d. What should be demanded in order to accept a term as a
      descriptor?
   e. The terminology of cross-references
3. Constructional problems
   a. Which elements should be included in a thesaurus?
   b. How should a descriptor be built?
4. Methodological problems
   a. Methods of building thesauri
      - for given thematic ranges
      - for overlapping fields /with other thematics/
      - of multilingual thesauri
   b. The methods of building descriptors
5. Conditions which descriptors and thesauri must fulfil in order
   to ensure their inter-branch and inter-language correlation
6. Conditions which must be fulfilled by descriptors and thesauri
   as tools for further development of information
7. Organisational problems
   a. How to organise the popularisation of methods agreed, to
      give them the level of recommendations
   b. How to organise work in order to facilitate or make possible
      the building of correlated thesauri
OPENING ADDRESS

I take great pleasure in opening this conference, one of the first
devoted to the building and development of thesauri.

I am also very pleased to greet the representatives of countries
with varying attitudes to the problems of science and scientific
information.

Despite all differences, the main tendency is towards co-operation,
especially in the scientific field.

Scientific information is, of course, the basis of research.
Without accurate information on the state of a given field of
science, without sufficient knowledge of trends in other fields,
without the recording of new achievements, scientific research
cannot be practically efficient; it cannot yield good results.

Developments which should be known to scientific workers and to
economists take place in different countries, are worked out in
differing languages and in different fields. The dissemination of
such information precisely and quickly, and meeting all practical
needs, is not a simple problem. Only systems which have similar
means of determining the content of documents and an unambiguous
method of inquiry can assure success.

And here I come to the main aim of today's conference. At the
present stage of development of scientific information and
technical methods at the disposal of the retrieval systems, the
most effective methodological tool to enable the precise assessment
of information, documents and retrieval seems to be precisely the
thesauri.

There is still a danger, however, that the thesauri elaborated in
various countries and fields spontaneously, with only a small
degree of co-ordinated activity, though streamlining the
information system within the frameworks for which they are
designed, may create additional linguistic, inter-branch and even
inter-institutional barriers against a free flow of information.

The task before the members of this conference is to contribute to
overcoming such barriers, or at least to reduce them to manageable
proportions. This is a difficult and ambitious task. I believe,
however, that considering even the practical difficulties which are
yet to arise, a great deal can be done.

I also hope that the presence at this conference of representatives
of UNESCO, ICSU and FID, organisations with great international
authority, will contribute to overcoming those difficulties.

I am happy to have the opportunity to wish the debates every
success, in terms of concrete results, and I also hope that even if
the planned aim is not fully achieved, at least a worthwhile
approach will have been made.

Now, since work is not the only thing in life, I also 
trust that you will spend your free time enjoyably, and may 
your memories of this visit bring you to Warsaw again in the 
future. 

Once again wishing you fruitful discussions, I now give the floor
to the organisers, and declare the conference open.

Prof. dr J. Groszkowski
President
of the Polish
Academy of Sciences
ANSWERS TO QUESTIONNAIRE ON THESAURUS PROBLEMS
J. M. Aitchison*

1. Definition of a thesaurus

A thesaurus is an alphabetical listing of concepts /i.e.
descriptors/ which provides structural and relational information
about the concepts. A list of terms which does not include
structural and relational information is not a thesaurus, even
though it includes details of synonymous terms. It is merely an
alphabetical list of subject headings or an alphabetical list of
descriptors.

- Which structural elements /semantic, syntactic, etc./ should be
  included in order to be able to call a given construction a
  thesaurus?

A thesaurus should include the following:

a/ Concepts /i.e. descriptors/ arranged in alphabetical order

b/ Details about each concept, i.e.

   /i/ Synonyms, alternative word forms, near synonyms, etc.
       For example:
           AUTOMOBILES
             UF Motor cars
       with reciprocal entry:
           MOTOR CARS use AUTOMOBILES

       /UF represents "Use for"/

   /ii/ Hierarchical /or structural/ relationships.
       For example:
           AUTOMOBILES
             BT Motor vehicles
             NT Estate cars

       /BT represents "Broader term".
       NT represents "Narrower term"/

* Institution of Electrical Engineers, London.

   /iii/ Relationships other than hierarchical. There are a wide
       variety of related terms, which may be roughly classified
       into groups, such as "thing/part", "thing/property",
       "process/agent", "thing/application", to name only a few.

       For example:
       Thing/part
           DIESEL ENGINES
             RT Pistons
       Thing/property
           PISTONS
             RT Wear resistance
       Property/process
           WEAR RESISTANCE
             RT Testing

       /RT represents "Related term"/
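
The entry structure described above can be sketched in code. The following is only an illustrative sketch, not part of the original answer: the class name `Entry`, the function names and the way the listing is printed are assumptions made for the example; the sample terms are those quoted in the text.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One descriptor with the four relationship types named in the text."""
    descriptor: str
    uf: list[str] = field(default_factory=list)  # "Use for": non-preferred terms
    bt: list[str] = field(default_factory=list)  # broader terms
    nt: list[str] = field(default_factory=list)  # narrower terms
    rt: list[str] = field(default_factory=list)  # related terms


def use_references(entries):
    """Derive the reciprocal 'X use Y' entries from the UF lists."""
    return {term.upper(): e.descriptor for e in entries for term in e.uf}


def alphabetical_listing(entries):
    """Descriptors and use-references merged into one alphabetical sequence."""
    items = {e.descriptor: e for e in entries}
    refs = use_references(entries)
    lines = []
    for heading in sorted(set(items) | set(refs)):
        if heading in items:
            e = items[heading]
            lines.append(heading)
            for tag, terms in (("UF", e.uf), ("BT", e.bt), ("NT", e.nt), ("RT", e.rt)):
                lines.extend(f"  {tag} {t}" for t in terms)
        else:
            lines.append(f"{heading} use {refs[heading]}")
    return lines


automobiles = Entry("AUTOMOBILES", uf=["Motor cars"],
                    bt=["Motor vehicles"], nt=["Estate cars"])
pistons = Entry("PISTONS", rt=["Wear resistance"])
listing = alphabetical_listing([automobiles, pistons])
```

Printing `listing` line by line gives the descriptor AUTOMOBILES with its UF, BT and NT lines, followed by the reciprocal entry "MOTOR CARS use AUTOMOBILES".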

- Which elements, factors, influence the organisation of the
  thesaurus?

a/ Subject field

   /i/ General vs. specific subject fields. A thesaurus in a
       specific subject field will cover the subject field in
       greater detail than the general thesaurus. Some
       relationships between concepts are peculiar to a specific
       subject field. The same terms in a general thesaurus might
       not display the specific relationships.

   /ii/ Differences between specific subject fields. Some subject
       areas have more precisely definable descriptors than others
       /i.e. "hard" versus "soft" language types/. In some subject
       areas synonyms abound, whilst in others they are not so
       common. In some subject fields the related terms and generic
       structure are more obvious than in others.



b/ Economic considerations

   If costs must be kept low, this influences the specificity of
   the concepts selected and the number of related terms which are
   introduced.

- How should the degree of complexity and the number of information
  items contained in a thesaurus be evaluated?

The more specific the concepts selected and the more highly
"pre-coordinated" the descriptors, the more complex the thesaurus
will become and the more terms it will include. A thesaurus in
which the terms are at a "low level of pre-coordination" /with
complex terms being constructed from these simple concepts at the
indexing stage/ will have many fewer terms than those with "highly
pre-coordinated" concepts. The disadvantage of low pre-coordination
level concepts is that frequently-sought concepts are not included
in the thesaurus and the description of the subject field is
incomplete. If a concept is not listed, related concepts cannot be
shown. There is need for research to find the optimum level of
specificity /i.e. pre-coordination level/.

2. Is the concept of a thesaurus sufficiently complete and
   univocally and exhaustively defined, or does it necessitate
   further analysis?

Further analysis is required on the following points:

a/ Rules for the selection of related terms /RT/. At present the
   choice of related terms seems to be made haphazardly. Where
   thesauri are constructed on a classified basis some help is
   given in this problem, but there are relationships which cut
   across hierarchical groups which can only be identified by those
   with a good knowledge of the subject field. It may be that it is
   impossible to draw up detailed rules for selection of related
   terms.
b/ Selection of specificity and pre-coordination level of concepts
   /see above/.

c/ Thesaurus/classification systems. Advantages/disadvantages of
   combining thesauri with classification systems. Problems in
   construction and use.

   Advantages:

   /i/ Thesaurus and classification in a combined system complement
       each other. The classification provides a visual display of
       subject fields showing hierarchical and other relations,
       whilst the thesaurus acts as an index to the classification
       and at the same time ensures control of synonyms and
       indicates related terms which cut across the hierarchies in
       the classified schedules.

   /ii/ A thesaurus/classification system is a multipurpose tool
       which can be used for the arrangement of books on shelves
       /because it includes notation/ and for conventional
       classified catalogues. Using the descriptors it may be used
       for post-coordinate systems and for computer searching.



3. What is the role of a thesaurus?

- direct use in information and retrieval systems

a/ For controlled retrieval languages. Provides list of controlled
   descriptors for indexing and searching. It also provides generic
   and relational information which allows for generic posting at
   the indexing stage, and manipulation of the question at the
   searching stage /i.e. the search may be made broader or more
   specific by use of hierarchical relationships shown in the
   thesaurus, and more exhaustive by drawing upon related terms/.

b/ Natural language systems. Thesauri can be used to suggest
   alternative terms etc. in compiling search formulations in
   natural language. A different type of thesaurus may be required
   for free-text searching, but no research has been done as yet on
   this problem.
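
The manipulation of the question at the searching stage can be sketched as follows. This is a hedged illustration only: the dictionaries `NT` and `RT`, the function names and the sample terms "Lorries" and "Garages" are hypothetical and do not come from the paper.

```python
# Hypothetical sample data: descriptor -> narrower terms / related terms.
NT = {
    "Motor vehicles": ["Automobiles", "Lorries"],
    "Automobiles": ["Estate cars"],
}
RT = {"Automobiles": ["Garages"]}


def narrower_closure(term):
    """The term plus all descriptors below it in the hierarchy."""
    found, stack = [], [term]
    while stack:
        t = stack.pop()
        if t not in found:
            found.append(t)
            stack.extend(NT.get(t, []))
    return found


def broaden(term):
    """Immediate broader terms, for making a search more general."""
    return [bt for bt, children in NT.items() if term in children]


def expand(term, with_related=False):
    """Search terms for a generic search on `term`; related terms
    are drawn upon when a more exhaustive search is wanted."""
    terms = narrower_closure(term)
    if with_related:
        terms += [r for t in list(terms) for r in RT.get(t, []) if r not in terms]
    return sorted(terms)
```

A search on "Motor vehicles" is thus made more specific by listing its narrower terms, more general by `broaden`, and more exhaustive by passing `with_related=True`.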
- in development of scientific information

The listing of scientific terms showing structural and relational
information reveals interdisciplinary relationships. Unfortunately
thesauri are always lagging behind scientific developments.

4. Methods of construction

- Methods of compiling thesauri

There are obviously many: this is a personal view.

a/ Definition of subject field.
b/ Classification of subject field into main groups or facets.
c/ Collection of concepts /descriptors/ in each subject group or
   facet. Terms obtained from the literature, from other thesauri
   and classification systems, dictionaries etc., also from subject
   experts.
d/ Tabulate data on each concept. Find synonyms, etc.
e/ Arrange concepts in detailed classification within each main
   subject group; this will assist in distinguishing hierarchical
   and other relationships.
f/ Arrange terms alphabetically. Insert use entries and check that
   BT/NT and RT entries are reciprocal.

- Possibilities and advantages of automation in compilation of
  thesauri

The advantage of automatic compilation is that it ensures full
reciprocity and avoids errors. It also takes much of the manual
drudgery out of thesaurus compilation. Automation should also
facilitate up-dating and allow frequent new editions.
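
The reciprocity check mentioned in step f/ and under automation can be sketched in a few lines. This is a minimal sketch under assumptions: the thesaurus is taken to be a plain dictionary of BT, NT and RT lists, and the function names are invented for the example.

```python
INVERSE = {"BT": "NT", "NT": "BT", "RT": "RT"}


def missing_reciprocals(thesaurus):
    """Return the (term, relation, target) entries the thesaurus lacks.

    `thesaurus` maps each descriptor to a dict such as
    {"BT": [...], "NT": [...], "RT": [...]}.
    """
    missing = []
    for term, relations in thesaurus.items():
        for rel, targets in relations.items():
            if rel not in INVERSE:      # e.g. UF entries are not checked here
                continue
            for target in targets:
                back = thesaurus.get(target, {}).get(INVERSE[rel], [])
                if term not in back:
                    missing.append((target, INVERSE[rel], term))
    return sorted(missing)


def complete(thesaurus):
    """Add every missing reciprocal entry in place and return the thesaurus."""
    for target, rel, term in missing_reciprocals(thesaurus):
        thesaurus.setdefault(target, {}).setdefault(rel, []).append(term)
    return thesaurus
```

Run on a draft thesaurus, `missing_reciprocals` lists the entries a compiler would otherwise have to find by hand; `complete` inserts them.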

5. Conditions which descriptors and thesauri must fulfil in order
   to ensure their inter-branch and inter-language correlation

Inter-branch correlation can best be achieved if the thesaurus is
compiled with the assistance of a classification system. This will
reveal that many concepts have applications in more than one
subject area and may fit into several hierarchies.



6. Conditions which must be fulfilled by descriptors and thesauri
   as tools for further development of information

a/ Thesauri must be easily up-dated. They must be hospitable to new
   material. Technology is changing so quickly that thesauri can
   never keep up. There should be frequent new editions. This
   should be less expensive with automated compilation.

b/ The change to natural language retrieval systems /which is
   likely to come about with increased mechanisation/ may demand a
   new type of thesaurus. More synonym dictionaries may be
   required, and at the same time a larger number of specialised
   thesauri, which must be rapidly up-dated.
SOME OBSERVATIONS ON THESAURUS PROBLEMS
Rolf Jaxuien^ 

1. Definitions

Concept: Mental idea of material or immaterial object based on
common characteristics which are usually formed by abstraction and
found identical.

Term: Name given to a concept and consisting of one or more words.

Descriptor: Univocal representative of a concept in a documentation
system. The descriptor can be a fixed term /"preferred term"/ or
any other designation.

Thesaurus: For purposes of information storage and retrieval a
thesaurus is an orderly compilation of concepts
o represented by as many synonymous terms as possible in one or
  more languages,
o in which homonymous terms are specially marked,
o in which a descriptor univocally represents a concept, and
o in which semantic relationships between concepts are registered.

Relationships can be derived from the definitions of the concepts.

2. Structural elements of a thesaurus

Before determining the organisation of a thesaurus and its
structural elements, it will be useful to consider the principal
role of a thesaurus in documentation. The main object of a
thesaurus is to facilitate information retrieval. Therefore the
following truism holds:

A documentation search asks for concepts rather than words. The
questioner has an idea of the facts wanted, but the way in which
the facts are expressed in a document or file does not usually
matter much. Any arbitrary formulations in the document or inquiry
must therefore be eliminated, i.e. reduced to the level of the
concept.

* Badische Anilin- & Soda-Fabrik AG, AI/Dokumentation, C 6 67
Ludwigshafen/Rhein.

Concepts as elements of information retrieval are therefore the
basic elements of a thesaurus. Because concepts are expressed by
words and a number of synonymous terms may stand for a single
concept, the conceptual level must be specially identified. Other
measures that enable terms to be clearly assigned to concepts
include:

o Homonymous terms should be identified. Short additions indicating
  the different meanings will do away with homonymity and establish
  clear assignments.

o For retrieval purposes it is convenient to use a designation that
  univocally represents a concept and is called a "descriptor". It
  is of minor importance whether this descriptor is a preferred
  term, a concept number or a systematic notation, although of
  course a notation is convenient in that it indicates not only the
  identity of a concept, but also its hierarchic relationships with
  other concepts.

The basic information contained in a thesaurus includes not only
means of identifying conceptual levels, synonyms and homonyms,
their assignment to concepts, and the designations of descriptors,
but also the semantic relationships between concepts, especially
the hierarchical relationships. They form the essential framework
in any organisation of concepts and play a decisive part in
information retrieval. Every question is bound to incorporate
hierarchical elements, either because narrower concepts have to be
taken into account or because the inquiry is relatively broad, i.e.
starts at a higher level to avoid losses.

As regards the construction of the hierarchy it should be noted
that conceptual reality is polyhierarchic in nature and manifests
itself in netlike connections. Polyhierarchy means that a concept
can be assigned not only narrower concepts but also any number of
broader concepts.

Other useful information in a thesaurus includes foreign-language
equivalents, different spellings, definitions of concepts or
explanations, important sources of conceptual information and
aspects of subordination of concepts.
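
A polyhierarchy of this kind is a network rather than a tree, and the broader concepts of a given concept are found by following every upward link. The sketch below is illustrative only; the table `BROADER` and its entries are hypothetical sample data, not taken from any actual thesaurus.

```python
# Hypothetical sample data: concept -> its directly broader concepts.
BROADER = {
    "photoluminescence": ["luminescence", "optical properties"],
    "luminescence": ["emission of radiation"],
    "optical properties": ["physical properties"],
}


def all_broader(concept):
    """Every concept reachable by following broader links upward."""
    seen = set()
    stack = list(BROADER.get(concept, []))
    while stack:
        c = stack.pop()
        if c not in seen:       # a concept reached by two paths is kept once
            seen.add(c)
            stack.extend(BROADER.get(c, []))
    return seen
```

Here "photoluminescence" has two direct broader concepts, and the search collects the broader concepts of both branches.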

An example of a thesaurus structure is described below. It has been
prepared at BASF and adopted by IDC Internationale
Dokumentationsgesellschaft für Chemie. For each concept all
pertinent information is registered together in a specific
sequence. To enable the information to be processed by computer,
every item starts with a symbol identifying the various entries.
The total amount of information for each concept is called a
"concept set". The various entries in the concept sets of the IDC
Thesaurus are listed in Table 1.

The concept entries were made as versatile as possible to enable
all details available to be readily incorporated without loss of
information and to provide for any future extension to new fields
of activity. The first entries are intended for the registration of
synonymous terms in different languages. At the end of each concept
set are the directly related concepts, always represented by one of
the synonymous terms. Polyhierarchical relationships exist where a
concept is assigned more than one broader concept.

The middle of any concept set is reserved for additional
information, such as definitions and sources. Under the concept
field entry each concept is assigned to one or more concept fields
which are based on aspects of concept categories and enable
concepts to be preordered by subject groups as the thesaurus grows.
Assignment to concept fields is by means of predetermined symbols
consisting of two capital letters /e.g. EO = properties, optical/.
The IDC Thesaurus covers about 40 concept fields which are
important in chemical documentation. Figure 1 shows some concept
sets taken from the concept area "Optical Properties".
Begriffsbenennungen - Concept Terms

  B   Benennungen in Deutsch: Synonyme und Quasi-Synonyme
      - Terms in German: Synonyms and near-synonyms
  C   Benennungen in Deutsch: Unterschiedliche Wortformen und
      Schreibweisen
      - Terms in German: Different Forms of the Same Word and
        Different Spellings
  E   Benennungen in Englisch      - Terms in English
  F   Französisch                  - French
  I   Italienisch                  - Italian
  J   Spanisch                     - Spanish
  K   Niederländisch               - Dutch
  P   Portugiesisch                - Portuguese

Zusätzliche Information - Additional Information

  D   Begriffsdefinition           - Definition, "Scope Note"
  H   Hinweise zur Benutzung       - Instructions for Usage
  Z   Begriffsgebiets-Einteilung /2-stellige Symbole/
                                   - Concept Field /Two-digit Symbols/
  Q   Quellen-Kurzangaben          - Coded References

Beziehungsbegriffe - Relation Concepts

  Übergeordnete Begriffe - Generic Concepts:
  O   Oberbegriff                  - Broader Concept
  S   Verbandsbegriff              - Total Concept
  X   Bezugsbegriff                - Reference Concept

  Untergeordnete Begriffe - Specific Concepts:
  U   Unterbegriff                 - Narrower Concept
  T   Teilbegriff                  - Part Concept
  Z   Zugehörigkeitsbegriff        - Accompanying Concept

  G   Gegenbegriff                 - Opposite Concept
  V   Verwandter Begriff           - Related Concept

  /Begriffssatzende - Concept Set Ending/

Tab. 1 Angabenkategorien der Begriffssätze des IDC-Thesaurus
Table 1 Entries of "Concept Sets" in IDC Thesaurus
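
Because every entry of a concept set starts with an identifying symbol, a concept set can be read mechanically. The following sketch assumes a simple line layout of "symbol, space, text", which is an assumption made for the illustration and not the actual IDC card format; the symbol groups follow Table 1 as far as it can be read, and the function names are hypothetical.

```python
from collections import defaultdict

# Symbol groups as read from Table 1 (assumed for this sketch).
TERM_SYMBOLS = {"B", "C", "E", "F", "I", "J", "K", "P"}      # names of the concept
RELATION_SYMBOLS = {"O", "S", "X", "U", "T", "Z", "G", "V"}  # relation concepts


def parse_concept_set(lines):
    """Group the entries of one concept set by their leading symbol."""
    entries = defaultdict(list)
    for line in lines:
        symbol, _, text = line.strip().partition(" ")
        entries[symbol].append(text.strip())
    return dict(entries)


def synonyms(concept_set):
    """All names of the concept, in alphabetical order of entry symbol."""
    return [t for s in sorted(TERM_SYMBOLS) for t in concept_set.get(s, [])]


def relations(concept_set):
    """Only the relation-concept entries of the concept set."""
    return {s: concept_set[s] for s in concept_set if s in RELATION_SYMBOLS}
```

For example, the lines "B PHOTOLUMINESZENZ", "E PHOTOLUMINESCENCE", "O LUMINESZENZ" and "U FLUORESZENZ" yield two synonymous terms, one broader and one narrower concept.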
Computer methods in thesaurus

Computer methods of processing thesaurus information will be
described below with reference to the IDC thesaurus.

The concept sets are punched on cards and fed to the computer,
which assigns a serial number /"concept number"/ to each concept
set and carries out various checking operations such as whether

/1/ any terms fed in are already available in the thesaurus, or

/2/ any terms specified as relation concepts have themselves been
    defined as concepts, i.e. occur under the categories for
    synonymous terms in another concept set. If not, the programme
    adds the missing concept together with the appropriate
    reference and a concept number. Otherwise the computer checks
    to see whether the reciprocal relationship is available and
    adds it, if necessary /"mutual completion of concept sets"/.



Concept fed in:
    E fuel
    U fuel oil

Check:
    E fuel oil
    O fuel
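
The mutual completion of concept sets can be sketched as follows. This is not the BASF programme, whose details are not given; it is a minimal sketch that assumes each concept set is a dictionary keyed by entry symbol, that a concept is named by its first English term /E/, and that only the broader/narrower pair O and U is handled.

```python
INVERSE = {"O": "U", "U": "O"}  # broader <-> narrower; other pairs omitted


def complete_concept_sets(concept_sets):
    """Number the concept sets; add missing concepts and reciprocal relations.

    `concept_sets` is a list of dicts such as {"E": ["fuel"], "U": ["fuel oil"]}.
    """
    by_term = {}
    for cs in concept_sets:
        for term in cs.get("E", []):
            by_term[term] = cs
    for cs in list(concept_sets):
        name = cs["E"][0]
        for symbol, inverse in INVERSE.items():
            for target in cs.get(symbol, []):
                other = by_term.get(target)
                if other is None:           # relation concept not yet defined
                    other = {"E": [target]}
                    concept_sets.append(other)
                    by_term[target] = other
                if name not in other.setdefault(inverse, []):
                    other[inverse].append(name)   # add the reciprocal entry
    # Serial "concept numbers" in order of entry.
    return {number: cs for number, cs in enumerate(concept_sets, start=1)}
```

Fed with the single concept set of the example /E fuel, U fuel oil/, the function creates a second concept set for "fuel oil" with the broader concept "fuel".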



The checking programmes ensure that all semantic relationships are
completed and all terms, including those in the categories of
concept relationships, occur as synonyms of a concept.

The completed information can be printed out in ordered
arrangement, classified by concept groups and completed with an
alphabetical list of the synonymous terms specified in the first
section. Fig. 1 is a specimen of this section, while Fig. 2
illustrates the alphabetical register. The computer saves the
trouble of setting up and maintaining a card index which in the
initial phase would have to be continuously improved.

Fig. 1 IDC Thesaurus: arrangement by concept fields, within fields
by ascending concept numbers
[Computer printout, not legibly reproduced. It shows concept sets
of concept field EO, among them OPTISCHE AKTIVITAET /OPTICAL
ACTIVITY/, FARBAENDERUNG, PHOTOLUMINESZENZ /PHOTOLUMINESCENCE/ and
CHEMILUMINESZENZ /CHEMILUMINESCENCE/, each with its definition,
concept field, coded references and numbered relation concepts.]

Fig. 2 Alphabetical index of IDC Thesaurus
[Computer printout headed THESAURUS-ALPHABET-REGISTER, not legibly
reproduced. Each line gives concept field, concept number, language
symbol and term, for the German and English terms from PHOTOCHROMIE
to PHOTOPOLYMERISATION.]

A disadvantage of thesaurus organisation is that the relationships
given for each concept always relate to only one hierarchical
level. Therefore a computer programme has been developed at BASF
which on the basis of the relationship entries enables the
thesaurus concepts to be printed out in the usual hierarchical
arrangement. The individual concepts, represented by concept
numbers and synonymous terms, are listed systematically in separate
lines, hierarchical levels being indicated by indentation. Capital
letters preceding the terms indicate the different types of
subordination /cf. Fig. 3/.
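
A hierarchical printout of this kind can be derived from the single-level relation entries by walking downward from a starting concept. The sketch below is not the BASF programme; the table `NARROWER`, the sample hierarchy and the output layout are assumptions made for the illustration, and the relations are assumed to contain no cycles.

```python
# Hypothetical sample data:
# concept -> [(symbol for the type of subordination, subordinate concept)].
NARROWER = {
    "LUMINESZENZ": [("U", "PHOTOLUMINESZENZ"), ("U", "CHEMILUMINESZENZ")],
    "PHOTOLUMINESZENZ": [("U", "FLUORESZENZ"), ("U", "PHOSPHORESZENZ")],
}


def hierarchy_lines(concept, symbol="-", level=0):
    """List a concept and everything below it, one line each.

    The hierarchical level is shown by indentation, the type of
    subordination by the capital letter preceding the term.
    """
    lines = [f"{'  ' * level}{symbol} {concept}"]
    for child_symbol, child in NARROWER.get(concept, []):
        lines.extend(hierarchy_lines(child, child_symbol, level + 1))
    return lines
```

In a polyhierarchy a concept with several broader concepts simply appears once under each of them.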



4. Use of a thesaurus in a computerized documentation system

There is yet another aspect to the computer processing of the IDC
Thesaurus. The thesaurus is an integral part of a computerized
documentation system for the storage and retrieval of conceptual
information. The alphabetically-sorted thesaurus tape is used for
computer classification of terms that have been fed in.

A brief outline of the procedure is given below. The conceptual
information which represents the contents of a document is written
as clear text on coding sheets /"indexing"/. The wording of the
concepts is free within certain limits governed by rules. The
newly-fed terms are alphabetically sorted and then matched against
the alphabetic thesaurus tape. All new terms are printed out and
pass to the thesaurus specialist for checking, editing and
incorporation into the thesaurus. This is followed by computer
checking, up-dating and arranging operations. By comparison with
the up-dated alphabetic thesaurus tape all terms on the tape can be
replaced by concept numbers and later by hierarchical notations and
thus standardized.
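
The matching of newly-fed terms against the alphabetic thesaurus tape can be sketched with a dictionary standing in for the tape. The function names, the upper-case normalisation and the concept numbers used in the usage note are assumptions for the illustration only.

```python
def match_terms(indexed_terms, thesaurus_index):
    """Split indexing terms into known ones (with concept numbers) and new ones.

    `thesaurus_index` maps every synonymous term, in upper case, to its
    concept number; it stands in for the alphabetic thesaurus tape.
    """
    known, new = {}, []
    for term in sorted(set(indexed_terms)):
        key = term.upper()
        if key in thesaurus_index:
            known[term] = thesaurus_index[key]
        else:
            new.append(term)    # printed out for the thesaurus specialist
    return known, new


def standardize(indexed_terms, thesaurus_index):
    """Replace each term by its concept number once the thesaurus is up to date."""
    return [thesaurus_index[t.upper()] for t in indexed_terms]
```

With a hypothetical index in which PHOTOLUMINESZENZ and PHOTOLUMINESCENCE share concept number 101, both wordings are standardized to 101, while an unknown term is held back until the specialist has incorporated it.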

The procedure is illustrated in Fig. 4 in a simplified flow chart
which shows two cycle processes, the storage and thesaurus cycles,
which meet where the two tapes are correlated. It will be seen that
the IDC Thesaurus is an open concept system which is kept
constantly up-to-date owing to the automatic feedback. All
improvements in and additions to the hierarchical structure can be
subsequently transferred to the entire file. In this way, not only
is the information kept up-to-date, but also the system can be
adapted to future requirements.

Figure 4. IDC Documentation of Concepts
Simplified Flow Chart
The fact that there is no fixed vocabulary is also an advantage. To
cut out errors in formulating the contents of documents,
particularly apt terms can be used and new concepts can be fed in
without any delay. The thesaurus being multilingual, indexing is
independent of language. Classification and coding of concepts have
to be done only once, and then they take place automatically with
always the same results.

5. General premises and problems of thesauri building

5.1 "Pre-coordinated" concepts or splitting into "post-coordinated"
concepts

One of the more serious problems is the recognition and handling of
compound or "pre-coordinated" concepts. By compound concept is
meant a concept which can be mentally split up into separate
concepts, the mental addition of which is certain to lead back to
the initial concept. Accordingly, a single independent and
unambiguous concept is to be looked upon as the smallest conceptual
unit whose further mental separation is pointless.

The problem apparently is that it is not possible to find an
absolute standard for defining a demarcation between single
concepts and compound concepts, the definition of demarcation
depending on the user's point of view. In documentation the
definition depends upon the requirements of information retrieval.

In a system of concepts the ratio between hierarchy and
coordination is determined by the degree of subdivision of
concepts. Concepts which are dissected do not exist in the
thesaurus as such. Synonyms, definitions and hierarchical
relationships cannot be registered. Dissection of concepts
interferes with the hierarchy of the classification system.

Because dissection influences indexing and thesaurus
building, it is necessary to establish certain rules which take
into account the significance of hierarchy for the ordering of
concepts and for information retrieval. On the other hand it is
important to consider whether compound concepts, if any, should
be entered into both file and thesaurus if the combination of
their basic concepts /"post-coordinated" concepts/ can be stored
just as unambiguously and appropriately. In any case a clear
distinction should be made between the mental dissection of
concepts and the purely linguistic dissection of compound terms
or words.

Under the indexing rules of IDC the basic policy is to dis- 
sect concepts only to the extent that the original conceptual 
connection in the context is not lost and there is no change in 
meaning. Each of the single concepts, when considered out of 
context, should have professional information value and be
definite in scope.
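
The difference between storing a pre-coordinated concept and
combining its basic concepts at search time can be illustrated
with a small sketch. The document numbers, the index terms and
the function name `search_all` are hypothetical and are not taken
from the IDC system; the sketch only assumes that each single
concept is posted against the documents it indexes.

```python
# Hypothetical postings: concept -> set of document numbers.
postings = {
    "oxidation": {1, 2, 5},
    "catalyst": {2, 3, 5},
    "oxidation catalyst": {2},   # pre-coordinated entry, indexed directly
}


def search_all(concepts, postings):
    """Post-coordination: return documents indexed with every given concept."""
    sets = [postings.get(c, set()) for c in concepts]
    return set.intersection(*sets) if sets else set()


pre = postings["oxidation catalyst"]
post = search_all(["oxidation", "catalyst"], postings)
print(sorted(pre))    # documents found through the compound concept
print(sorted(post))   # documents found by combining the single concepts
```

In this made-up data the combination also retrieves a document
that was not indexed with the compound concept, which is the kind
of lost conceptual connection the dissection rule is meant to
prevent.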

5.2 Treatment of synonyms and homonyms

The question of conceptual unity also appears in connection
with synonymity, i.e. when it is to be decided whether terms
should be registered as synonyms or quasi-synonyms for a
particular concept or whether they designate different concepts.
In linguistics the opinion is held that there are no true
synonyms and that each term stands for a new and independent
concept. In documentation, where the emphasis is on practical
usefulness, such rules are too strict. In checking terms for
synonymity the question is not only whether differences in the
definitions of the concepts can be detected, but also whether
usage differs in the literature and whether any difference is
relevant for information retrieval purposes.

While synonyms are independent of context by definition,
homonyms depend on the context and should therefore be identified
as such during indexing. In the thesaurus the best way to
characterise homonyms is to write a parenthetic expression behind
the term; this expression should contain a conceptual limitation
in the form of an explanatory word or at least a number. In the
IDC system the parenthetic expression causes the computer to
issue a message when the thesaurus is matched against the new
storage tape, so that homonyms may be recognized and brought into
an unambiguous form on the storage tape.
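
A minimal sketch of such a check is given below. It is not the IDC
program; the function names, the message wording and the sample
terms are assumptions made for illustration. It only assumes that
a homonym is written as "term (qualifier)" in the thesaurus and
that an unqualified occurrence of the same term in newly indexed
material should trigger a message.

```python
import re
from collections import defaultdict

# "Base (qualifier)" with the qualifier in a trailing parenthesis.
QUALIFIED = re.compile(r"^(?P<base>.+?)\s*\((?P<qualifier>[^()]+)\)$")


def split_qualifier(term):
    """Split 'Base (qualifier)' into (base, qualifier); qualifier is None if absent."""
    m = QUALIFIED.match(term.strip())
    if m:
        return m.group("base"), m.group("qualifier")
    return term.strip(), None


def homonym_messages(thesaurus_terms, new_terms):
    """Report unqualified new terms whose base word is a homonym in the thesaurus."""
    readings = defaultdict(set)
    for term in thesaurus_terms:
        base, qualifier = split_qualifier(term)
        if qualifier is not None:
            readings[base.lower()].add(qualifier)
    messages = []
    for term in new_terms:
        base, qualifier = split_qualifier(term)
        if qualifier is None and base.lower() in readings:
            options = ", ".join(sorted(readings[base.lower()]))
            messages.append(f"{base}: homonym, choose one of ({options})")
    return messages


thesaurus = ["Cell (biology)", "Cell (electrochemistry)", "Catalyst"]
incoming = ["cell", "Catalyst", "Cell (biology)"]
for message in homonym_messages(thesaurus, incoming):
    print(message)
```

Terms that already carry a qualifier pass silently; only the
ambiguous, unqualified form is reported for correction.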

If thesauri of different scientific fields are to be
compatible or used in different branches of knowledge, an
additional overall check for homonymity is necessary.

5.3 Definition of concepts

As additional information on a concept, its definition is of
great value. At least in cases where relationships are difficult
to indicate, a definition should be worked out and entered into
the thesaurus.

In the interest of practical documentation, concepts should
be defined with general usage in mind rather than terminological
standardization, however justified. Experience shows that
most differences in the use of special technical concepts can be
attributed to differences in the degree of abstraction of the
underlying definitions. One author has a narrower, another a
broader idea of the meaning of a concept. Because of the given
scope of variation, it is of no use to fix unduly narrow
definitions.

Taking all this into consideration, the following guidelines
for a user-oriented definition of concepts result:

o The definition of a concept should be as close as possible
  to the usage of the concept in the literature.
o In cases of doubt, the more abstract and comprehensive
  version is to be given preference in order to avoid loss of
  information during information retrieval.

5.4 Different kinds of concept relationships and their de- 
finitions 

In order to describe concept relationships it is necessary
to have an understanding of the different kinds of concept
systems and the different types of relation concepts. Stimulated
by a publication by WÜSTER, the terminology of the types of
relation concepts has been completed and put to practical use in
the IDC Thesaurus. A distinction is made between abstraction
system /"Abstraktionssystem"/, whole-part system
/"Bestandssystem"/ and attributive or affiliation system
/"Zugehörigkeitssystem"/, each having two reciprocal types of
relation concepts.

Abstraction systems are characterised by broader concepts
and narrower concepts. Narrower concepts have all the
characteristics of the broader concept and in addition at least
one limiting characteristic.

Whole-part systems indicate the relation between an entity
and its parts and are described by total concepts and part
concepts. Part concepts are arrived at by the mental division of
a whole /total concept/ into its parts.

Example: "body", "chassis" and "engine" are part concepts
of the total concept "automobile".

Attributive or affiliation systems are described by
reference concepts and accompanying concepts. Accompanying
concepts are closely related to their reference concepts, but do
not coincide with them in their characteristics /no abstract
relationship/, nor can they be formed by a division or dissection
of their reference concepts /no whole-part relationship/.

Examples: "Catalyst" is an accompanying concept of "catalysis".

"Distilling column" is an accompanying concept of
"distillation".

These examples show that the attributive system includes
important cross-references between different concept categories,
as between a process and the function of a material, or between
operations and apparatus, while in the abstraction system
relationships are restricted to one and the same concept
category.

In the system of concepts, abstract relations, whole-part
relations and attributive relations appear simultaneously and
penetrate each other. Therefore, if the order of the system is
to be readily understood and the organisation is to be convenient
for retrieval purposes, it is important that the different types
of relation concepts be identified and arranged in a logical
sequence.

While in the abstraction and whole-part systems concept
relationships result directly from the definition of the relation
concepts, the direction of the relation in the attributive or
affiliation system is obtained by logical interpretation of
"affiliation". It is a characteristic of "affiliation" that it
depends on and belongs to something else; the meaning of
"affiliation" is that it refers to an object which can be taken
as a reference concept.

Symbols   Types of relation concepts     Concept system

O, U      Broader Concepts
          Narrower Concepts              Abstraction System

S, T      Total Concepts
          Part Concepts                  Whole-part System

I, Z      Reference Concepts
          Accompanying Concepts          Attributive System

G         Opposite Concepts

V         Related Concepts

Table 2. Types of relation concepts with abbreviation symbols
of IDC Thesaurus
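
The reciprocal character of the relation concepts can be sketched
as a small data structure in which every entry automatically
produces its counterpart. The class, its method names and the
spelled-out relation names are assumptions of this sketch (the
IDC abbreviation symbols are deliberately not used), and treating
"opposite" and "related" as their own reciprocals is likewise an
assumption.

```python
from collections import defaultdict

# Each relation type paired with its reciprocal type.
RECIPROCAL = {
    "broader": "narrower", "narrower": "broader",
    "total": "part", "part": "total",
    "reference": "accompanying", "accompanying": "reference",
    "opposite": "opposite", "related": "related",
}


class ConceptRelations:
    def __init__(self):
        self._links = defaultdict(set)   # concept -> {(relation, other concept)}

    def add(self, concept, relation, other):
        """Record 'other' as the <relation> concept of 'concept', plus the reverse link."""
        self._links[concept].add((relation, other))
        self._links[other].add((RECIPROCAL[relation], concept))

    def of(self, concept, relation):
        """Return all concepts standing in the given relation to 'concept'."""
        return sorted(o for r, o in self._links[concept] if r == relation)


relations = ConceptRelations()
relations.add("automobile", "part", "engine")
relations.add("automobile", "part", "chassis")
relations.add("catalysis", "accompanying", "catalyst")
print(relations.of("engine", "total"))
print(relations.of("catalyst", "reference"))
```

Entering only one direction of each relation keeps the two
reciprocal entries from drifting apart when the structure is
revised.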

With compound concepts the situation is more complicated.

A general rule for the assignment of concept relations is
derived from the fact that compound concepts have been formed by
either "determination" or "conjunction" or "disjunction" or
"integration" /cf. 1/. Most common is the formation by
determination, i.e. limitation of the original concept by the
addition of a supplementary characteristic to the content of the
original concept. In this case the basic word in the compound
word is the broader concept. The explanatory portion is the
reference concept or, if the relation between the explanatory
portion and the compound word does not seem to be relevant for
retrieval purposes, the related concept.

Example:     Oxidation catalyst
                 O Catalyst
                 Z Oxidation

Therefore:   Catalyst                    Oxidation
                 U Oxidation catalyst        Z Oxidation catalyst
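
The rule for compounds formed by determination can be written down
as a short sketch. The function name, the argument names and the
spelled-out relation names are hypothetical; the sketch assumes
that the indexer has already identified the basic word and the
explanatory portion of the compound.

```python
def relations_by_determination(compound, basic_word, explanatory, relevant=True):
    """Return (concept, relation, target) triples for a compound formed by determination.

    The basic word becomes the broader concept. The explanatory portion becomes
    the reference concept, or only a related concept when the relation is not
    considered relevant for retrieval purposes.
    """
    triples = [
        (compound, "broader", basic_word),
        (basic_word, "narrower", compound),
    ]
    if relevant:
        triples += [
            (compound, "reference", explanatory),
            (explanatory, "accompanying", compound),
        ]
    else:
        triples += [
            (compound, "related", explanatory),
            (explanatory, "related", compound),
        ]
    return triples


for triple in relations_by_determination("Oxidation catalyst", "Catalyst", "Oxidation"):
    print(triple)
```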

It is, however, not always easy to distinguish between
single concepts and compound concepts and, if pre-coordinated,
the type of formation is not always apparent.

Concepts for which it is difficult to determine the
direction of the relation are characterised as "related
concepts". These are concepts that cannot be assigned to any of
the other relation concepts and whose cross-relationship often
consists in a somewhat remote mental association.

An unresolved problem is how to deal with overlapping. In
the IDC Thesaurus overlapping is indicated by special
references.

6. Outlook 

The only way to avert the impending disaster in information
handling is through international cooperation. What is required
above all is agreement on the methods to be used. Efficient and
future-oriented methods of documentation must be developed. It
is being realized more and more that the efficiency of a
documentation system depends upon the perfection of its
conceptual system.

A thesaurus seems to be the best means to compile concepts
together with all pertinent information and to generate a
polyhierarchical system of concepts. The consistent use and
precise representation of concept relations determines the
quality of the thesaurus and its reliability in information
retrieval. The problems of finding the right methods for
constructing a thesaurus and the many decisions which are
connected with the study of special concept fields have only been
briefly dealt with in this paper.

1. A descriptor is a chosen formalised term, consisting of
one or a few words or of a determined symbol, forming an
elementary part of the thesaurus, meant to represent in a
univocal way the determined subject content of equivalent terms
or groups of terms.

2. An ascriptor is a term or qualification largely synonymous
with the given descriptor. Ascriptors in the information
retrieval system are replaced by corresponding descriptors, while
in thesauri they indicate corresponding descriptors by means
of a system of reference marks.

3. A thesaurus is an orderly arranged quantity of notions
and termini creating an open system of subject headings placed
within the framework of a domain or a problem, classified,
part of which consists of descriptors with indicative
interdependences and their mutual conceptional relations.
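
The replacement of ascriptors by their descriptors can be
illustrated as follows. The vocabulary, the table name `USE` and
the function `normalise` are invented for the illustration and do
not come from any particular thesaurus; the sketch assumes that
every ascriptor refers to exactly one descriptor.

```python
# Hypothetical reference marks: ascriptor -> descriptor.
USE = {
    "automobile": "motor car",
    "car": "motor car",
    "lorry": "truck",
}
DESCRIPTORS = {"motor car", "truck", "engine"}


def normalise(terms):
    """Replace ascriptors by their descriptors; collect terms unknown to the thesaurus."""
    descriptors, unknown = [], []
    for term in terms:
        key = term.strip().lower()
        key = USE.get(key, key)          # follow the reference mark, if any
        if key in DESCRIPTORS:
            if key not in descriptors:   # each descriptor is assigned once
                descriptors.append(key)
        else:
            unknown.append(term)
    return descriptors, unknown


print(normalise(["Car", "Automobile", "engine", "tyre"]))
```

Terms returned as unknown are candidates for admission to the
thesaurus, which is one way in which it remains an open system.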
Structural elements of a thesaurus

1. General scheme of groups and subgroups of descriptors.

2. Index /may be tabular/ of descriptors taking into account
their mutual hierarchical dependences, semantic and functional
connections, ascriptors etc.

3. An alphabetic index of all terms embraced by the thesaurus
with cross-references showing the correlation of the terms,
their mutual connections and the connections between descriptors
and respective ascriptors.

4. Description of the system of cross-references applied
in a thesaurus.

Note: A list of terms with neither structural information nor
information about their mutual dependences is not a thesaurus,
even if it has partly synonymous terms.

x Central Institute for Scientific, Technical and Economic
Information, Warsaw.

A thesaurus should as a rule cover fully the given domain
or problem for which it was created. However, analysing existing
thesauri and the literature concerning them, we can distinguish
two tendencies:

1 - a thesaurus should be adapted to a definite collection
of information material and serve to make accessible the
necessary information from the respective collection without
either information "noise" or silence.

2 - a thesaurus should fully embrace the respective domain
or problem, whether or not in the informational collection to
which it is applied the materials are complete. Such a thesaurus
would help in tracing the information gaps of respective
collections.
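
The text does not define "noise" and silence numerically. On the
common reading, assumed here, noise is the share of retrieved
items that are irrelevant and silence is the share of relevant
items that are not retrieved. The function name and the sample
document numbers below are hypothetical.

```python
def noise_and_silence(retrieved, relevant):
    """Return (noise, silence) as fractions for one retrieval result."""
    retrieved, relevant = set(retrieved), set(relevant)
    # Noise: irrelevant items among those retrieved.
    noise = len(retrieved - relevant) / len(retrieved) if retrieved else 0.0
    # Silence: relevant items that were missed.
    silence = len(relevant - retrieved) / len(relevant) if relevant else 0.0
    return noise, silence


noise, silence = noise_and_silence(retrieved={1, 2, 3, 4}, relevant={2, 4, 6})
print(noise, silence)
```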

The definition of a thesaurus is a problem which still
demands discussion. It would be most advisable to define
univocally the conception of a thesaurus, for which as we know
there are numerous definitions. The importance of differences in
the way of formulating may be here accepted as negligible; more
important, even essential, are differences in the substantial
understanding of this conception. A certain number of authors
consider that a thesaurus is a kind of dictionary: ideological
or notional. In such cases the requirements given to the
thesauri are limited to:

- obtaining a certain number of words of natural language
chosen by analysis of the subject matter of texts and
systematised according to an initially chosen classification
system,

- obtaining an index of descriptors as a tool for a precise
designation of content of the necessary information and enabling
coordinate indexing of documents and informational inquiries,

- creating an orderly collection of key words chosen by
statistical analysis of texts, no matter what subject.

This point of view does not sufficiently explain the still
growing importance that is attached to the thesaurus, especially
as to the information retrieval language in modern information
systems and in studying modern methods of scientific
information. There is a basic difference between a thesaurus and
a dictionary. A dictionary is used to find words and terms, and a
thesaurus to find notions. We therefore cannot consider a list of
terms which does not include structural information about mutual
connections and dependences of notions as a thesaurus.

In the latest definitions a thesaurus is described also as
a classification system. This view is interesting, but needs
discussion. It is indispensable to designate what additional
conditions should be accomplished by a thesaurus in order to be
generally considered as a classification system independently of
its main aim as an information retrieval language.

It seems that the reasons quoted above sufficiently justify
the need of continuing further discussion.

Our experiences and the analytical and research work carried
on while preparing a thesaurus of scientific information show
that certain classical methods are being formed: of choosing
informational material for the thesaurus, of defining the scheme
and its thematic range; principles of working out descriptors;
the system of cross-references.

Each thesaurus still has special features which distinguish
it from the others. This is due to the domain or problem for
which the thesaurus has to be built, and to the concrete
conditions in which it is worked out and applied.

The building of a thesaurus should be preceded by an
accurate designation of the aim which it should serve and by the
designation of available criteria of the quantity of notions
which were collected for building the thesaurus. The methodology
of building the thesaurus depends also on the accepted
definition of the thesaurus.

The first stage of work upon a thesaurus, the operation of
collecting the optimal quantity of notions concerning the domain
or problem, is a classical stage, and there cannot be any
possibility of avoiding it.



How varied the methods of collecting an assemblage of
notions may be depends, however, on the range of the thesaurus.

- Considering that a thesaurus is adapted only to a
designated collection of informational material, the quantity of
notions may be formed by collecting key words and groups of words
systematically from the documents of the respective collection.

- Assuming that the thesaurus should embrace a given domain 
or problem, it is necessary to construct - already as the first 
stage of the work - a semantic-hierarchical scheme of this the- 
matic range. 

Contrary to the first solution, when solving the problem as
above, the operation of gathering notions and terms should not
be based on one information collection, even on the best one.
Such a collection and the materials embraced in it should be
analysed systematically and very thoroughly. The results should
be compared with other local and foreign collections of
materials, with semantic-functional-hierarchic schemes of this
domain or problem, theoretical ones as well as those based on
experience. It is also necessary to investigate the largest
possible number of information items and retrieval systems of
this domain or problem.

When the work on building the thesaurus is conducted
parallel to the process of gathering information material /in the
range of the domain or problem/, a third solution could be
applied. It should be started with elaborating a theoretical
semantic-hierarchic scheme. This scheme will be more theoretic
than when building it on the basis of an existing and
systematized collection. Such a solution gives the possibility of
excluding, already at the design stage, materials only loosely
connected with the basic thematics of the domain. And such
materials are always to be found in every collection.

Gathering an amount of notions and termini should be
realized on the basis of the natural language, embracing in the
collection entries specific to the specialized language of the
domain or problem and even, if necessary, terms originating from
slang.

When planning analytical and research work on the fund of
notions and terminological problems of the domain, the limits of
this domain should be precisely defined.

It is especially important that the thesaurus should be built
for the range of the domain, and not for the given collection.
The delineation of the scope of the domain for which the
thesaurus should be built and the comparative analysis of
existing semantic schemes of this domain and of related domains
will be of evident aid when delineating the thematic range of
the thesaurus.

To solve the problem of designing the optimal way of
gaining access to the collected notions and termini, one has to
classify the collection and to range the notions and termini
according to the categories of the scheme. The goal of this
operation is to find out the gaps in the fund of notions and
termini and to correct and improve the scheme.
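
This classification step can be sketched as follows. The category
names, the sample terms and the function `find_gaps` are
hypothetical; the sketch assumes that each collected term has
already been assigned a category by an analyst.

```python
from collections import defaultdict


def find_gaps(scheme_categories, classified_terms):
    """Group terms by category; report empty categories and terms outside the scheme."""
    by_category = defaultdict(list)
    outside = []
    for term, category in classified_terms:
        if category in scheme_categories:
            by_category[category].append(term)
        else:
            outside.append(term)
    # Categories without any term point to gaps in the fund of notions.
    empty = [c for c in scheme_categories if c not in by_category]
    return dict(by_category), empty, outside


scheme = ["processes", "apparatus", "materials"]
terms = [
    ("distillation", "processes"),
    ("distilling column", "apparatus"),
    ("patent", "legal aspects"),
]
grouped, empty, outside = find_gaps(scheme, terms)
print(empty)     # categories of the scheme still lacking terms
print(outside)   # terms suggesting that the scheme needs correction
```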

During this stage of the work it is already necessary to 
define the principles of utilising the collected mass of 
notions and termini in the thesaurus, and especially: 

- the criteria of choosing termini to become descriptors, 

- the principles of descriptor building, 

- the principles of utilising synonyms, closely related 
words etc., 

- the principles of connecting descriptors with ascrip- 
tors. 

These criteria and principles should enable the elabora- 
tion of a collection of descriptors and ascriptors showing 
their relationships. A tabular display with descriptors sys- 
tematized according to the scheme of the domain or problem 
may be used here as an efficient tool. 

The determination of the method of alphabetic ordering of
descriptors and ascriptors is to be done in the next stage of
the work. An arrangement in the form of a permutated index
should be most effective as far as informative value is
concerned. The system of cross-references should correspond to
the hierarchic relationships of descriptors and their
interconnections with ascriptors shown in the tabular display.
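
A permutated index lists every multi-word descriptor once under
each of its words. The following sketch shows one simple way of
producing such entries; the function name and the sample
descriptors are hypothetical, and a real index would normally
also exclude insignificant words.

```python
def permuted_index(descriptors):
    """Return (lead word, full descriptor) pairs, one per word, in alphabetical order."""
    entries = []
    for descriptor in descriptors:
        for word in descriptor.split():
            entries.append((word.lower(), descriptor))
    return sorted(entries)


for lead, descriptor in permuted_index(["Oxidation catalyst", "Catalyst"]):
    print(f"{lead:<12}{descriptor}")
```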

I do not intend here to discuss all essential problems
arising when building the thesaurus, only the most essential
elements.

The popularity of thesauri as information retrieval tools
is still rising. This is evidently a sign of their usefulness.
Nevertheless the difficulties bound up with elaborating thesauri
are great. Although the methods of building thesauri are
increasingly precise, it is impossible to build a thesaurus
following precisely these methods: each thesaurus has its own
problems, and demands modification and complementation of the
method. In order to master all these difficulties it is
necessary to collect experiences. This may be done by means of
publications devoted to the direct exchange of know-how and of
experiences concerning the theory of building thesauri as well
as their practical application. So far, experiences on the
efficient use of thesauri as retrieval tools have been amassed on
the basis of a very limited number of thesauri. Such experiences
help to evaluate the thesaurus in its role as:

- a vocabulary of notions enabling the correct indexing of
documents,

- an information-retrieval tool,

- a classification system.

It seems also of importance to discuss how the development
of thesauri will influence the development of scientific
information theory. It is also to be desired that the development
of the theory and principles of thesauri building will eventually
reach the stage of universalization and internationalisation.

PRINCIPLES OF THESAURI BUILDING

K. Leski x



The concept of thesaurus that I propose is intended to
embrace all thesauri, not only those built for domains with
highly formalized languages /as in technology, the exact
sciences etc./.

1. The concept of a thesaurus as
an element of a retrieval system

The goal of building thesauri is to obtain a univocal way
of qualifying the content of documents and of their retrieval.
The univocal character should be achieved in what concerns
thematics and organisation on a possibly large scale. The range
of thesaurus thematics should therefore be as wide as possible,
and it should embrace all the collections of materials from this
domain.

The optimal thesaurus should therefore, theoretically,
comprise the whole of human knowledge and should be applied to
all retrieval systems.

At the present development stage this is not possible, for
several reasons. One could say that it is rather likely that
such a stage will never be reached. By the time the necessary
precision and univocal character of the terminology, the identity
or at least the semantic and functional similarity of terms
examined from all points of view, have been achieved, the
knowledge of laws governing the living languages will probably
increase so that it will be possible to organize the retrieval
systems on different bases and with the aid of tools more
efficient than the present ones.

x Documentation and Scientific Information Centre of the Polish
Academy of Sciences, Warsaw.

The conclusion concerning the range that respective
thesauri have to cover is that they should comprise the ranges of
subjects where the respective elements of the thesaurus are
looked upon from one point of view.

Such ranges are practically represented by respective 
fields of science, and thesauri should therefore be adapted to 
their frameworks. 

There are no limits to the organizational reach of the the- 
sauri. It is languages only which may form such barriers: the 
relations between the notions in various languages that were 
formed in different cultural conditions may be differentiated; 
therefore translating notions from one language to another may 
cause either great informational "noise" or silence.

Within the frames of one language or one domain such a
danger does not exist. We can assume in such conditions that a
thesaurus should comprise the range of a full domain of science
and, in these frameworks, all the collections applying retrieval
systems based on methodological tools worked out in the given
language.

The thesaurus cannot present an inflexible construction,
stiff, closed and fossilized. It must be receptive to the inflow
of new notions covering new problems, be amenable to the
removal of out-of-date terms, and be able to absorb changes of
meaning in the descriptors. It must therefore present an open
system.

The thesaurus must ensure an almost automatic passage
from terms embraced by a certain problem to other problems
connected semantically and functionally.

It must therefore reveal the relations between the terms
comprised /or between notions represented by these terms/.

As a result of this reasoning /obviously abbreviated here to
fundamental elements/ the thesaurus may be defined as follows:
"An open system covering a determined full thematic range
containing an ordered number of terms, some of which are admitted
as descriptors, showing the relations occurring between these
terms and their mutual dependences".

Such a thesaurus consists of 2 categories of terms:
descriptors and other terms. The thesaurus cannot exist without
these terms.

Theoretically, we can assume that a thesaurus can embrace
all the semantically precise terms from a certain domain.
Evidently such a multitude of terms would be too numerous and
as a consequence such a thesaurus would cause a lot of
information noise or silence, and above all its application would
involve much difficulty. The thesaurus must therefore embrace
only some such terms, and only those where the level of
generality will be high enough.

Among these terms may appear some groups with semantically
close meanings etc., therefore there will not be any necessity of
applying all these terms. It will be advisable to choose terms
representative for such groups. The method of building such
terms cannot be quite free; it must be subordinated to some
common rules.

While applying these terms it is not possible to refer to
the whole thesaurus each time in order to find their mutual
relations. These relations, at least the most important, must
be indicated directly by each of them. The most effective way
of doing this is by means of cross-references.

Not all the terms that can be met within the given thematic
range can be applied operatively, and therefore their roles must
be determined in the thesaurus. Thus we come to the definitions
of fundamental elements of the thesaurus: descriptors and
ascriptors.

A descriptor is a "chosen formalized term, built of one
or a few words or a symbol, making an elementary part of the
thesaurus intended to represent in a univocal way the determined
subject content, with pointing out its basic relations and
dependences with other descriptors and ascriptors".

An ascriptor is "every term included in the thesaurus, but
not a descriptor, being one of the synonymous terms, or one
close in meaning in relationship to one descriptor comprised by
the thesaurus".

2. Structural elements of a thesaurus

The abovementioned way of comprehending the essence of
the thesaurus and its elements creates special requirements for
its construction. As a consequence, such a thesaurus should
comprise:

a. A general scheme of groups and subgroups of descriptors.

b. A tabular display of descriptors, showing their mutual
hierarchic dependences, their semantic and functional relations,
ascriptors etc.

c. An alphabetically permutated register for all terms
applied in the thesaurus, indicating their mutual relations by
means of cross-references.

Such a register may be divided into separate lists of
descriptors and ascriptors.

Besides the above forms of presenting a thesaurus, the
relations between the terms applied in it may be presented by
other methods. However, a system which does not comprise the
forms mentioned above will not be a thesaurus.

3. Elements influencing the organization of the thesaurus

The abovementioned structural elements of the thesaurus
determine its external shape. Its subject content, and as a
consequence its organisation, depend on:

a. The thematic range comprised by the thesaurus and its
semantic content.

   The tabular display of descriptors, realised according to
   the scheme comprising all dependences and relations of each
   descriptor /encl. 1/, fulfills two roles simultaneously: of
   a material enabling a clear and full picture of these
   dependences and relations and of the analytical apparatus
   controlling the building of the scheme of groups and subgroups
   of descriptors. The subject division of the thesaurus can
   also be established in another way, but the already mentioned
   analytic method appears to have many positive aspects; above
   all it eliminates various interpretations.

b. The assumed degree of minuteness of the thesaurus, closely
connected with the method of operating the thesaurus.

c. The overlapping of the thematic ranges in the thesaurus
with other thematic ranges.

d. The clearness and terminological precision of the
thematics embraced by the thesaurus, and as a consequence the
method of choice and building descriptors.

4. Building of the thesaurus

A thesaurus should be built so as to ensure as full an
analysis as possible of its thematic range. Its thematic range
should be therefore determined and presented in the form of a
semantic-hierarchical scheme.

Such a scheme should then be filled with subject entries
collected on the basis of analysis of documents entering within
the range of the planned thematics, books and periodicals,
published and not published, of inquiries addressed to the
collection and of different types of analyses of frequency and
means of appearing in the documents of those entries.

Filling the scheme should be followed by correcting and
improving the established semantic-hierarchical scheme. A very
important stage in the building of a thesaurus is to work out
which ranges it has in common with other thematic ranges. These
will be the ranges through which will take place the flow of
information between the thematic range of the respective
thesaurus and neighbouring thematic ranges. While analysing the
subject entries appearing in these ranges, it is necessary to
look at them from two points of view /or more/ of the overlapping
subject ranges. Similarly, we should approach the choice and
building of descriptors in two or more ways, establishing their
relationships etc. Great help can be provided here by an accurate
building of the tabular display of descriptors.



5. Language barrier 

In item 1 I have mentioned the language barrier as limiting
the reach of organizational application of the thesaurus.
Such barriers undoubtedly exist and overcoming them will always
be difficult, but may not be impossible.

It is obvious that the semantic contents of respective
subject entries, and as a result of descriptors, sometimes differ
from one language to another.

The semantic hierarchical arrangements of these entries and 
descriptors, as well as of their mutual connections and correla- 
tions, the synonyms and homonyms etc. are differentiated. 

When passing from the system of a thesaurus built for one
language to such a system in another language, it is necessary
to consider carefully all these differences and create thesauri
common for these languages. There will evidently be no
possibility of obtaining in this case a very high level of
conformity of descriptors and their arrangement. It is to be
assumed that the precision and minuteness of a multilingual
thesaurus will be somewhat limited.

The degree of conformity will here be the function of all 
these factors and probably a result of their given interpola- 
tion. 



6. The role of the thesaurus 

In item 1 I limited the role of the thesaurus to that of an
element, a tool in information-retrieval systems.

Such is probably the contemporary role of the thesaurus. But
is it the only role? Such a limitation would not be right. A
properly built thesaurus is meant first of all to qualify
documents, and later to retrieve them. Even at this first stage
another possibility of utilising a thesaurus reveals itself:
analysing the content of documents with the help of descriptors,
and in consequence breaking up these documents into microelements,
classifying these microelements and determining their mutual
correlations. The question arises of whether the thesaurus we are
building now will perform the role of an instrument for such an
analysis, or whether another type of thesaurus will originate
from the objective systems of documents in the given domain.

7. The correlation of thesauri 

According to the above reasoning, it seems that there is the
possibility of a free flow of information from one thematic
range embraced by one thesaurus to another thematic range. Such
a result may be achieved through:

a. Agreement on understanding the essence of the thesaurus.

b. The unification of requirements for a thesaurus concerning
the structural elements necessary for its existence and the ways
of building thesauri and descriptors.

c. Paying particular attention to overlapping thematic ranges
common to interrelated thematic fields and examining the
descriptors embraced by these ranges from all possible points of
view.

d. Making thorough comparisons of thesauri systems with the
same /or similar/ thematic ranges, but in different languages,
and building bilingual or multilingual thesauri.

Satisfying the above conditions should ensure the correlative
character of newly built thesauri.

In order to achieve a similar result in relation to existing
thesauri, it is necessary to analyse thoroughly the ranges they
have in common with other domains and thesauri, if possible also
adapting descriptors created for these ranges.

Though the new possibilities of using thesauri as instruments
for analysing the contents of documents /item 6/ do not seem to
have a direct influence on their mutual correlation, this aspect
of the problem deserves close attention nevertheless.
REMARKS ON THE GENERAL PRINCIPLES OF THESAURI BUILDING
Imre Molnar^

1. Hierarchical and alphabetical 
thesaurus 

Ordered collections of the terms of human knowledge, which
contain an enumeration of concepts, as well as their
interpretation and relations, are named thesauri. This definition
includes a few elements which need further explanation.

1.1. Ordered collection of terms

As is shown by the title, the appraisal of human knowledge can
be realised by collecting the terms, on the one hand, and by the
systematization as a condition for a collection of terms
designated as a thesaurus, on the other. A thesaurus may be
constructed in several ways, using different principles of
arrangement.

A thesaurus compiled for a certain purpose owes its particular
structure to the general relation between structure and function.
Besides, the developmental stage of the thesaurus is also a
factor determining the structure. A thesaurus has differently
structured forms in the course of its development. In general it
might be said that the first and fundamental form of thesauri is
the hierarchically-arranged structure. This form reflects the
subordination of science. The hierarchically-ordered thesaurus
gives an interpretation of the correlation of terms in a
particular branch of science, and makes possible an
appraisal-in-depth relating to special questions of different
scientific problems.



^ Library of the Hungarian Academy of Sciences, Budapest 






If a thesaurus has the function of gathering all possible
concepts of a discipline, the hierarchically-ordered form is
best suited to this task.

A thesaurus considered as an instrument of information systems
can hardly be effective in the hierarchical arrangement. The
hierarchically-ordered list of terms is not suitable for
information analysis, indexing, storage or retrieval of
information. These problems may only be solved by
alphabetically-arranged thesauri. The hierarchical structure of a
thesaurus compiled for information work is only the first step to
the final, alphabetically-arranged form. The hierarchical
thesaurus remains, of course, a necessary instrument of a
complete thesauri-system because only this type of thesaurus is
able to give adequate answers to such questions as these:

Are the terms of a branch of science sufficiently detailed?

Are the markings of different correlations of terms suitable?

Does the thesaurus contain synonyms, homonyms, etc. in a
necessary quantity?

Does the thesaurus contain a certain new term? If so, what
kind of relations does that term already have?

Thesauri which are used in the routine work of information
systems, and are often born during continuous information work,
are employed in alphabetical form. By its mechanical arrangement,
as well as by its codes for relations of terms and by its
homonyms separated and synonyms connected by code signs, the
alphabetically-arranged thesaurus provides an efficient
instrument for the indexing of documents, for information storage
and retrieval, and also for standardization of information
queries.

There are many possibilities of marking the hierarchical level
and semantic relations of included terms in the alphabetical
thesaurus. The codes connected with each term of the thesaurus
are able to represent and mark the class of terms which the
actual term belongs to.

This is shown by a detail of a biochemical thesaurus:
Hierarchical                      Alphabetical
arrangement                       arrangement

12     Protein                    Dehydrogenase          1211
121    Enzyme                     Enzyme                 121
1211   Dehydrogenase              Lactate-dehydrogenase  12111
12111  Lactate-dehydrogenase      Protein                12


Both the code number of every term and the number of digits
of a code number have a definite meaning relating to the
hierarchy of terms.
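The meaning carried by the code numbers can be illustrated with a short sketch. It assumes, going only by the four entries of the detail above, that the number of digits gives the hierarchical level and that dropping the last digit gives the code of the broader term; the function names are hypothetical and not part of any described system.

```python
# Codes and terms taken from the biochemical detail above.
TERMS = {
    "12": "Protein",
    "121": "Enzyme",
    "1211": "Dehydrogenase",
    "12111": "Lactate-dehydrogenase",
}


def level(code):
    """Hierarchical level, assumed here to equal the number of digits."""
    return len(code)


def parent(code):
    """Code of the broader term (assumption: drop the last digit)."""
    candidate = code[:-1]
    return candidate if candidate in TERMS else None


def hierarchical(terms):
    """Order by code, which places each term after its broader term."""
    return sorted(terms.items())


def alphabetical(terms):
    """Order by term; the attached code still carries the hierarchy."""
    pairs = [(term, code) for code, term in terms.items()]
    return sorted(pairs, key=lambda pair: pair[0].lower())


for term, code in alphabetical(TERMS):
    print(f"{term:<24}{code:<7}level {level(code)}")
```

Both arrangements are produced from the same table, which is one way to read the remark that the hierarchical form is the first step towards the alphabetical one.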



1.2. Human knowledge

Thematically, the main types of thesauri are: general and
special thesauri. The type of a thesaurus is thematically defined
by the size of the branch of science which the concepts belong
to. The limitation of different disciplines and of their terms
seems to be one of the most exciting questions of thesauri
building.

The completeness of the lexicon of a discipline is always
relative; the limits of a branch of science can be differently
appraised in depth.

1.3. Interpretation of terms

Every term in a thesaurus must have its exactly-defined and
characteristic meaning. Therefore, terms must be interpreted
exactly. This interpretation may be different in size: it may
include only the meaning of the terms but it may also contain
synonyms, related terms, etc. It is also advisable to indicate
whether a term is a descriptor /indexing term/ or only a
subject-heading /no indexing term/.

The criterion of unambiguity requires, in the first place,
the separation of homonyms. The use of brackets is a frequent
method of making the necessary distinction between the different
meanings of a homonym.

1.4. Marking of relations

Semantic correlations mark the hierarchical level of terms
in a thesaurus. The relations lead from a generic term to a
specific one and vice versa. The most frequently used relations
are as follows:

Broader term     /BT/
Narrower term    /NT/
Related term     /RT/
Use              /USE/
Used for         /UF/

The sufficient marking of both the hierarchical level and the
relations of terms in a thesaurus ensures that the required depth
in the indexing process is achieved.

2. Thesauri in information
systems

Information systems and the indexing process frequently
result in a thesaurus. However, they always have a significant
feedback to the thesaurus. An information system can only be
effective if it corresponds with the continuous and
rapidly-changing information needs. This is why any one branch of
science needs a special thesaurus for the use of a certain
institution, for the different depth of information queries. A
research problem of a discipline may be of more importance in one
particular place than in another. The dispersion of usage of
certain terms is the greatest in borderline sciences. An
institution which studies a problem regarded as an important one
analyses its terms in a more detailed way than another which
tends to investigate this problem only peripherally, and uses its
terms only in a more general breakdown.

This situation leads to the atomization of the methods and
techniques of thesaurus building. And this atomization has no
negative character in general. For all the institutes, research
laboratories, enterprises, etc. it is a duty and a necessity to
build their own thesauri, which must correspond with the
information needs in their own organization.

This development offers fewer and fewer opportunities for
thesauri which tend to cover the entire range of human knowledge.
The rate of the production and specialization of information
makes it more and more urgent to build thesauri relatively narrow
in volume and analytically deep in appraisal. These thesauri may
be compiled and structured in a variety of depths, sizes, forms,
etc.
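For an institution compiling its own small thesaurus, the relation markers of section 1.4 can be kept consistent mechanically. The following is a minimal sketch, not a description of any existing system: the class and method names are hypothetical, the reciprocal pairing of the markers (BT with NT, USE with UF, RT with itself) is the usual convention assumed here, and the abbreviation "LDH" is added purely as an illustrative non-preferred term.

```python
from collections import defaultdict

# Each relation marker and its assumed reciprocal.
RECIPROCAL = {"BT": "NT", "NT": "BT", "RT": "RT", "USE": "UF", "UF": "USE"}


class Thesaurus:
    def __init__(self):
        # term -> marker -> set of target terms
        self.entries = defaultdict(lambda: defaultdict(set))

    def relate(self, term, marker, target):
        """Record a relation and its reciprocal in one step."""
        self.entries[term][marker].add(target)
        self.entries[target][RECIPROCAL[marker]].add(term)

    def display(self, term):
        """Return the alphabetical-thesaurus entry for one term as lines."""
        lines = [term]
        for marker in ("USE", "UF", "BT", "NT", "RT"):
            for target in sorted(self.entries[term][marker]):
                lines.append(f"  /{marker}/ {target}")
        return lines


thesaurus = Thesaurus()
thesaurus.relate("Enzyme", "BT", "Protein")
thesaurus.relate("Enzyme", "NT", "Dehydrogenase")
thesaurus.relate("LDH", "USE", "Lactate-dehydrogenase")  # illustrative entry
print("\n".join(thesaurus.display("Enzyme")))
```

Recording both directions at once means that an entry such as "Protein" shows its narrower term without a separate editing step, which matters when the thesaurus is revised often.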

On the basis of the abovementioned arguments, all the
principles, facts, methods and techniques elaborated and made
usable for thesauri building are today very valuable and
necessary. Preparation of data, methods, techniques and designs
is the greatest help experts of thesaurus research may give to
the specialists of information systems.

3. General thesaurus, special
thesaurus
The construction of a thesaurus requires a comparatively long
time, and time is also the destiny of the thesaurus. As a
consequence of the acceleration of scientific development, the
division of the individual disciplines and the semantic range of
the terms will be transformed, will broaden out or will become
narrower, but in certain cases they may cease.

The division of the content of terms causes most problems,
often those most difficult to solve. A scientific problem which
was expressed during a certain period by a single descriptor may
function in the future as the content of several descriptors.

This process makes for a rich system of cross-references;
however, this richness later becomes proliferation, which makes
the system at first cumbersome, and finally useless.

As a consequence of the abovementioned process, thesauri must
be rearranged periodically. A thesaurus can be expanded during a
relatively long interval; this possibility has, however, a
decreasing occurrence in time. The possibility of expansion is
inversely proportional to the growth of a thesaurus.

In general, thesauri have a retrospective character. The
larger the thesaurus is, the greater is the validity of this
statement. Computers may result in a significant time-saving in
the compilation of very large thesauri; there is, however, much
to do which is not mechanizable or computerizable, and these
tasks require long and skilled labour, so that the thesaurus will
be retrospective anyway.

The larger the volume of a thesaurus, the harder the problem
of its application to modern and analytical information needs,
owing to the inevitably poor representation of the emerging new
scientific fields of our days.

It is important to make efforts to build thesauri for the
individual branches of science. The building of relatively small
thesauri is always more economical than that of larger ones.
This is why I feel it necessary to interpret the basic principles
of thesauri building according to the size of different types of
thesaurus.
SOME GENERAL PROBLEMS CONCERNING COMPILATION OF THESAURI
Vítězslav Maixner^

Introduction

One of the basic theoretical insights of informatics in recent
years is undoubtedly the fact that thesauri should be regarded as
a component part of retrieval languages, and that it is important
not to lose sight of the significance /role/ of the corresponding
grammatical description. It is therefore pointless to deal with
the problems of thesaurus construction without first clarifying
some elementary notions of the typology of retrieval languages.

However, the typology of retrieval languages is not the main
topic of our paper. We shall therefore confine ourselves to
giving a few working definitions from this field, without
claiming precision, and then proceed to the main problems.

Syntactic-semantic relations

By the syntax of a retrieval language we understand here a
system of formal and lexical-formal means that is clearly
established and generally accepted in the description of the
given language, insofar as this system is able to denote the
semantic /meaning/ relations within a complex subject
description. For a retrieval language with syntax there exists
exactly one such syntactic system, which helps to determine the
set of correct complex subject descriptions in the given
language.

^ Central Office of Scientific, Technical and Economic
Information, Prague.

Formal syntactic-semantic means of a retrieval language are,
for example, brackets, connecting signs, various punctuation
marks, a semantically relevant order of descriptors, etc.

Lexical-formal syntactic-semantic means of a retrieval language
are, in particular, special descriptors, appropriately marked in
the thesaurus, whose lexical definition is supplemented by clear
rules for their syntactic function.

Lexical syntactic-semantic means of an informal type, i.e.
purposefully chosen or accidental /combinations of descriptors in
the given case/, can quite often determine completely or almost
unambiguously the relations between descriptors in a particular
complex description of a document. In this case, however, one
cannot always speak of syntax /this is not a sufficient
condition/.

The use of semantics in a retrieval language without syntax, as
follows indirectly from the preceding approximate definitions and
considerations, is directed mainly at the system-lexical
/thesaurus/ and applied /indexing, formulation of search
requests/ fields.

On passing to a retrieval language with syntax, the natural
problem of the relative unambiguity of the description of a
document falls partly into the plane of syntactic-semantic
relations /comparable in general outline with the syntax of the
sentence and the semantics of the sentence in natural languages/.

Problems of a system-lexical character /creation of the
thesaurus and its revision/ are, in the main, common to retrieval
languages with and without syntax. However, the requirements on
the size of the thesaurus /this concerns in the first place the
number of descriptors and of established lexical combinations/
can be considerably reduced as a result of introducing an
appropriate syntax.

In a certain sense it can be said that, in cases of application
comparable in quality /effectiveness of retrieval activity/, the
function of the thesaurus is partly taken over by syntax. A
syntax deprived of such a compensating function is inevitably an
ineffective complication of the retrieval language.
An analogous mutually compensating relation may be assumed
between the functions of formal and lexical-formal means in
syntactic systems.

At present there already exists a large number of retrieval
languages with syntax, which differ from one another at least in
formal respects. For the purposes of our tentative survey we
shall first introduce the following general classification, which
brings out the semantic aspects of syntax:

1/ retrieval languages expressing only the degree of
interconnection between descriptors in the complex description
of a document

2/ retrieval languages expressing only the function /role/ of
individual descriptors in the complex description of a document

3/ retrieval languages expressing both the degree of
interconnection and the role of descriptors

4/ retrieval languages expressing the interconnection between
descriptors in the complex description of a document more
explicitly than the retrieval languages indicated in items 1-3.

If, on the contrary, we emphasize the formal aspects of
syntax, which are by their nature more or less secondary, we can
distinguish the following three relevant features, each with a
marked or an unmarked variant:

a/ lexicality

b/ positional relevance

c/ formal-symbolic designation of the relation.

If we assume full combinability of the marked and unmarked
variants of a/, b/ and c/ and take the semantic and formal
aspects into account together, we easily reach the conclusion
that for the tentative classification given above 32 different
type combinations of retrieval languages with syntax can
theoretically be formed. Combinations of exclusively unmarked
variants of a/, b/ and c/ are meaningless in view of the given
definition of a retrieval language with syntax, and the maximum
number of type combinations should therefore be reduced to 28.
The number of type combinations actually realized, however, in
all probability lies below this theoretical limit.
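The figures 32 and 28 given above can be checked by enumeration. The sketch below rests on one reading that reproduces both numbers and is an assumption, not a statement of the author's derivation: the four semantic classes 1/ to 4/ are each combined with the marked or unmarked variant of the three formal features a/, b/ and c/, and the cases in which all three features are unmarked are then discarded.

```python
from itertools import product

SEMANTIC_TYPES = (1, 2, 3, 4)      # classes 1/ to 4/ of the classification
FORMAL_FEATURES = ("a", "b", "c")  # the three formal features a/, b/, c/


def type_combinations():
    """Yield (semantic type, marked features) with at least one marked feature."""
    variants = product((False, True), repeat=len(FORMAL_FEATURES))
    for sem, marks in product(SEMANTIC_TYPES, variants):
        if any(marks):
            marked = tuple(f for f, m in zip(FORMAL_FEATURES, marks) if m)
            yield sem, marked


ALL = len(SEMANTIC_TYPES) * 2 ** len(FORMAL_FEATURES)
MEANINGFUL = sum(1 for _ in type_combinations())
print(ALL, MEANINGFUL)
```

Under this reading the four discarded cases are exactly the combinations of each semantic class with no marked formal feature.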
Semantic problems
of thesaurus compilation

The literature dealing with the methods of compiling thesauri
is comparatively accessible. The specific nature of the concept
"thesaurus" is interpreted differently by different authors. In
our paper we have in mind only those thesauri whose descriptors
stand in a direct formal connection with the lexical units of
some natural language.

It is generally known that in practice the compilation of
thesauri is very laborious work, which can be schematically
divided into three stages:

1/ selection of sources

2/ compilation of the thesaurus

3/ testing and revision.

In the following remarks we shall try to set out this method
in general, theoretical outline from the point of view of the
semantics of retrieval languages:

On item 1/ The collection of terminological materials on which
one relies in concrete cases is, considering the wide
bibliography of terminological publications in the given subject
field, always inevitably accidental and incomplete.

The importance of this initial material is naturally very
considerable, although it is desirable that it be subordinated to
a unified conception of thesaurus construction. The most
favourable course of work would apparently be one in which the
selection of materials is carried out in accordance with a clear
conception worked out in advance.

On item 2/ At this stage the absence of a clear conception must
be regarded as a serious shortcoming.

The working out of such a conception presupposes the
availability of:

1/ a grammatical description of the retrieval language, from
which an approximate characteristic of the retrieval language and
the basic rules of indexing and of formulating search requests
can be determined;

2/ basic data on the reference and information file: its size,
average annual growth, subject structure, and the prevailing
method of processing documents and search requests

a/ on average over the last few years

b/ in its prospective development over the coming years.

The conception of thesaurus construction should contain:

A/ A definition of the main semantic fields of the thesaurus,
sometimes also a selection of the main lexical descriptors /as a
rule, only one descriptor for a given field/;

B/ Ranges for the number of descriptors for the whole thesaurus
and for individual semantic fields;

C/ A possible definition, in accordance with the type of
retrieval language, of the basic formal-lexical descriptors;

D/ The establishment of all types of references that will be
applied consistently in the thesaurus;

E/ An approximate definition of the extent to which
phraseological combinations will be given /in proportion to the
number of descriptors/.

Such a conception, insofar as it corresponds approximately to
prerequisites 1/ and 2/ given above, should represent a selection
of factors that is limiting in a positive sense, facilitating the
practical work of compiling the thesaurus and signalling possible
errors and erroneous tendencies in the course of this work.

At present, however, the adequacy of a conception can as a rule
be judged only "ex post", i.e. after experimental testing of the
thesaurus, which is certainly not an ideal situation. The path
leading reliably from the prerequisites to a sufficiently
adequate conception ought to be prepared as far as possible by
information theorists. For the time being, however, one can only
note the far from happy fact that almost nothing has been
undertaken in this direction on a world scale.

Note 1. The methodological principle given above can be
briefly expressed as follows: the grammatical description of the
retrieval language and the characteristic of the reference and
information file approximately determine the conception of the
thesaurus. On a more general plane this could be interpreted as a
certain deliberate use of the interrelations between the
grammatical description of the retrieval language, the
characteristic of the file and the structure of the thesaurus.
From the formal theoretical side, i.e. leaving aside the
methodological problems of practice, any element of this triple
can be considered as depending on the other two elements, which
can be written schematically as follows /RL = retrieval
language, F = file, T = thesaurus/:

1/ RL = f(F, T)

2/ F = g(RL, T)

3/ T = h(RL, F)

So far we have explicitly considered only relation 3/, but we
do not claim that relations 1/ and 2/ always lose their
significance in practice.

From the individual items of the above scheme of the conception
of thesaurus compilation, together with the prerequisites
concerning the retrieval language and the file, there follow:

From A, the basic orientation of the semantic structure of the
thesaurus

From B, the degree of reduction of the synonym groups of the
initial natural language

From C, a reductive feedback to all main semantic fields of the
thesaurus /potentially bringing about a higher average combining
capacity of descriptors/

From D, a reductive feedback to all main semantic fields of the
thesaurus /potentially bringing about a refinement of the local
semantic structure of the thesaurus/

From E, a feedback to the combining capacity of descriptors and
a refinement of the local semantic structure.

Note 2. In cases C, D and E the reductive feedback may show
itself in particular in a tendency to reduce the number of
descriptors or in a tendency to simplify the definitions of
descriptors.

Note 3. The refinement of the local semantic structure in
cases D and E may show itself in particular in the supplementing
of definitions for small semantically connected groups by means
of relations of synonymy, hierarchy or context.

Corrections made in the thesaurus may remain within the
framework of the conception worked out in advance or, in the
extreme case, may represent a substantial change of the original
conception /insofar as it was clearly formulated/.

Without claiming completeness, systematic character or
uniformity of criteria, the following types of changes can be
distinguished:

1/ merging, splitting, exclusion or addition in the system of
main semantic fields

2/ changes in the hierarchical structure within some main
semantic field /without changes in the corresponding unordered
set of descriptors/

3/ introduction or exclusion of descriptors with a subsequent
change of the hierarchical structure.

Experimental testing of the functions of the thesaurus,
sporadic corrections, additions and planned systematic revisions
of the whole thesaurus or of semantic fields serve one purpose:
that the semantic structure should correspond better to the
characteristic of the file in its development and to the given
grammatical description of the retrieval language. The role of
the thesaurus in this sense, however, need not always be merely
one of passive adaptation. A stabilized or relatively stable
structure of the thesaurus exerts a substantial influence on some
parameters of the reference and information file, in particular
on its division into various frequency classes. One can even
foresee a case in which the optimized semantic structure of the
thesaurus, together with the characteristic of the file, will
demonstrate the need to change the grammatical description of the
retrieval language or to choose another retrieval language /a
substantial change may be equivalent to choosing a retrieval
language of a different type/.

Conclusion

The aim of this paper has been to outline briefly some basic
problems of thesaurus construction from a general point of view.
It is clear that many of these problems would deserve more
detailed special study. For the present there are not many such
works at our disposal, and it remains to express the hope that
theorists of informatics will devote more attention to them in
the coming years.

Some General Problems Concerning
Compilation of Thesauri

Thesauri, based on a direct formal correlation with lexical
units of a certain natural language, are conceived as integral
parts of retrieval languages, the typology of which is briefly
presented. Special attention is devoted to some general aspects
of the relations to be found between a retrieval language
grammar, file structure and structure of the thesaurus. Some
principles directed towards a controlled compilation procedure
are proposed, and the influence of such a conception on the
structure of the thesaurus is indicated.
STATISTICAL ANALYSIS OF DOCUMENTATION FILES - SADF



Josef HojSlsekF 



Statistical analysis of a documentation file is carried out
by two special computer programmes.

The first is used to test the distribution of documents in the
file as well as exploitation of the file in retrieval requests,
the aim being to modify the classification system and thus also
the file structure in such a way that it may become more suitable
for manual storage and retrieval.

The print-outs of the second programme /they are 3 in number/
provide much lucid information on the documentation file or on
its manageable sample, which is used as input data.

The processing of documents consists in assigning one or more
simple indexing terms /further SIT/ to each document. The group
of SIT's is called retrieval document description /further RDD/
- FIG. 1 A.

The whole RDD punched on a tape serves as input data for each
document entering the analysis. The computing system is so
arranged as to allow simultaneous processing of up to 10,000
documents. - FIG. 1 B.

As the result of data processing we get 3 output print-outs
and some other information.

The first print-out is called a "Table of simple indexing
terms" and it is used as an auxiliary table to ascertain an
unknown rank for a known SIT. - FIG. 2.
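A table of this kind can be sketched in a few lines. The sketch assumes that the rank of a SIT is its position when SIT's are ordered by falling frequency, which the text does not state explicitly; the sample documents and the function name are hypothetical.

```python
from collections import Counter

# Hypothetical sample: each document is represented by its RDD, a set of SIT's.
DOCUMENTS = [
    {"protein", "enzyme"},
    {"enzyme", "dehydrogenase"},
    {"enzyme"},
    {"protein", "enzyme"},
]


def sit_table(documents):
    """Return {SIT: (rank, frequency)}, rank 1 being the most frequent SIT."""
    freq = Counter(sit for rdd in documents for sit in rdd)
    # Ties are broken alphabetically; how ties are ranked is an assumption.
    ordered = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return {sit: (rank, n) for rank, (sit, n) in enumerate(ordered, start=1)}


table = sit_table(DOCUMENTS)
for sit in sorted(table):  # alphabetical order, for looking up a known SIT
    rank, n = table[sit]
    print(f"{sit:<15}rank {rank}  frequency {n}")
```

Printing the table alphabetically serves the stated purpose of finding the unknown rank of a known SIT.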

^ UVTEI, Prague.

The second print-out is called "Survey of SIT functioning".
It is divided into blocks, each block being reserved to one SIT.
Columns 1-10 involve only the given SIT and columns 11-14 show
the relation of the given SIT to the so-called concurrent SIT's,
which are in combination with it. - FIGS. 3 and 4.
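The notion of concurrent SIT's can be illustrated as follows. This is a sketch under the assumption that two SIT's are concurrent whenever they occur in the same RDD; the sample data and the function name are hypothetical and do not reproduce the columns of the actual print-out.

```python
from collections import Counter, defaultdict

# Hypothetical sample: each document is represented by its RDD, a set of SIT's.
DOCUMENTS = [
    {"protein", "enzyme"},
    {"enzyme", "dehydrogenase"},
    {"enzyme"},
    {"protein", "enzyme"},
]


def concurrent_sits(documents):
    """Map each SIT to a Counter of the SIT's sharing an RDD with it."""
    together = defaultdict(Counter)
    for rdd in documents:
        for sit in rdd:
            together[sit].update(other for other in rdd if other != sit)
    return together


blocks = concurrent_sits(DOCUMENTS)
for sit in sorted(blocks):  # one block per SIT
    print(sit, dict(blocks[sit]))
```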

In this second print-out there is the concluding information:

- average number of RDD's

- average number of concurrent SIT's - FIG. 3.

The last - the third print-out - "RDD Frequency list" -
characterizes individual RDD's. It consists of pairs of lines:

- In the upper line there is the RDD written in full - here
the RDD is composed of four SIT's

- In the lower line there are 12 columns to be distinguished.
See FIG. 6.

The last five columns 8-12 contain ranks of SIT's written in
the upper line, which participate in the given RDD.

This 3rd print-out gives the following information:

- average number of documents indexed by one RDD

- average percentage of documents indexed by one RDD

- and average increment of entropy - not yet evaluated -
FIG. 7.
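The first two averages of the third print-out can be sketched as below. The sketch assumes that an RDD is an unordered set of SIT's and that RDD's are ranked by falling frequency; the sample data and function names are hypothetical, and the entropy increment is left out because the text gives no definition of it.

```python
from collections import Counter

# Hypothetical sample: each document is represented by its RDD, a set of SIT's.
DOCUMENTS = [
    {"protein", "enzyme"},
    {"enzyme", "dehydrogenase"},
    {"enzyme"},
    {"protein", "enzyme"},
]


def rdd_frequency_list(documents):
    """Return rows (rank, RDD, frequency, percentage) by falling frequency."""
    freq = Counter(frozenset(rdd) for rdd in documents)
    total = sum(freq.values())
    ordered = sorted(freq.items(), key=lambda kv: (-kv[1], sorted(kv[0])))
    return [(rank, tuple(sorted(rdd)), n, 100.0 * n / total)
            for rank, (rdd, n) in enumerate(ordered, start=1)]


def averages(rows):
    """Average number and average percentage of documents per distinct RDD."""
    count = len(rows)
    return (sum(row[2] for row in rows) / count,
            sum(row[3] for row in rows) / count)


rows = rdd_frequency_list(DOCUMENTS)
for rank, rdd, n, pct in rows:
    print(rank, " + ".join(rdd), n, f"{pct:.1f}%")
print(averages(rows))
```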

A special print-out gives the following very important data:

- number of occurrences of all SIT's

- number of all processed documents

- number of SIT types - FIG. 8.



Evaluation of Output Print-outs

The output print-outs offer us rich, well-arranged information
on the document file, or more exactly on the processed sample.
They present an objective picture of the structure of the file.
The data obtained can be used in many ways, such as for the
revision and improvement of the indexing system, for the
objective ascertainment of principal and peripheral problems of
the file, for the compilation of a thesaurus and so on.

The relation between rank and frequency or accumulative
frequency is given by the table in FIG. 11, which is in fact an
extract from the "RDD Frequency list" /3rd print-out - FIG. 6/.
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Graphical representation of this table is In ?IG. 9« 

The axis of ranlcs is expressed on a logarlttanlc scale /see 
Interval 1-1705/ so that se can study Individual sections of the 
curve. 

The real course of the curves can be aeen in PIG* lOfVhere 
both scales are linear* 

When determining the relation of the length of BDD^s to 
their frequency /i«e* to the sise of equivalence classes/ «e 
shall York with a nev notion of an ieo-frequency group* 

In order to detemlne this relation the table in FIG* 12 was 
conpUedt 

Sach iso-frequency group is identified by two characteris- 
tics t 

- by the BDD frequency of this group - for instance 24 
/framed/ 

- and by the interval of the BBD ranks belonging to this 
is»-frequency group* 

The purpose of the method is the analysis of HDD's by their
length, i.e. we try to ascertain how many SIT's compose HDD's
belonging to individual iso-frequency groups.
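
The grouping described above can be sketched in a few lines. This is only
an illustration under stated assumptions: each term combination is taken
to be a tuple of indexing terms with an observed frequency, ranks are
assigned by descending frequency, and the function name and sample data
are hypothetical, not taken from the print-outs.

```python
from collections import Counter, defaultdict


def iso_frequency_groups(combinations):
    """combinations: list of (term_tuple, frequency) pairs.

    Returns {frequency: {"ranks": (first_rank, last_rank),
                         "by_length": Counter of combination lengths}}.
    """
    # Rank by descending frequency; ties keep their input order.
    ranked = sorted(combinations, key=lambda item: -item[1])
    groups = defaultdict(lambda: {"ranks": None, "by_length": Counter()})
    for rank, (terms, freq) in enumerate(ranked, start=1):
        group = groups[freq]
        first = group["ranks"][0] if group["ranks"] else rank
        group["ranks"] = (first, rank)          # interval of ranks in the group
        group["by_length"][len(terms)] += 1     # how many terms compose it
    return dict(groups)


# Hypothetical sample: three combinations share the frequency 24.
sample = [(("A", "B"), 24), (("A",), 24), (("B", "C", "D"), 7), (("C",), 24)]
groups = iso_frequency_groups(sample)
```

Each entry of `groups` carries the two characteristics named in the text
(the shared frequency and the rank interval) plus the distribution by
length within the group.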

The diagram in FIG. 13 graphically presents the left part
of the table in FIG. 12. Each iso-frequency group is divided,
in a linear way, into subgroups by different HDD lengths, i.e.
by the number of SIT's involved in the HDD.

The diagram indicates how many iso-frequency groups exist,
the number of HDD's included in each group, and frequencies of
individual HDD's as well as their distribution by length in
each iso-frequency group.

The values in the right-hand part of the table in FIG. 12
are the basis for plotting the diagram in FIG. 15, which is a
graphical representation of analysis regarding HDD frequencies.
This diagram demonstrates the relative parts of the file
included in separate iso-frequency groups as well as the
distribution of these groups by the length of HDD's. It can be
used as a nomogram. It presents, also, how the file is
distributed into HDD's by their length.

The analysis, the result of which indicates the combination
power of indexing terms, starts from blocks for individual
SIT's we process columns 8 to of the second print-out
"Survey of SIT Functioning" - FIG.

Then in the third print-out we gradually find HDD's with
ranks given in front of horizontal lines - examining columns 8
to 12 and marking by circles those SIT's that are participating
in particular BDD's.

As an example the following block of SIT with rank 73 -
FIG. 21 - was processed.

A graphical display is to be found in the diagrams in FIGS. 22
to 24.

These investigations will allow us to learn a great deal
about related terms or descriptors /SIT's/.

It is probable that the needs of practice will lead to
alterations of the programme, so that it offers the maximum of
useful information in the most convenient form. The proposed
methods of the second programme concern only the analysis of a
document file, although we know that the programme could also
be used for the evaluation of a set of retrieval requests.



NOTE:

For all figures, see the Annexes.



COMPILATION OF THESAURI FOR USE IN COMPUTER SYSTEMS

A thesaurus can be defined as a structured vocabulary for
use in information storage and retrieval systems.

Three parts of this definition need further elaboration:

1. A vocabulary is a collection of terms.

2. The structure of a vocabulary can be described as a set of
relationships between terms.

3. Utilisation of a thesaurus in an information system involves
a set of rules which take into account the characteristics
of the system.



X 

1. There are three types of thesauri according to the type
of terms they consist of.

A few of the earlier thesauri consisted solely of uniterms,
i.e. single words. Some of these acquired significance only in
combinations. Even simple concepts had to be represented by a
combination of uniterms.

Many thesauri are of the "uniconcept" type. Uniconcepts can
be either uniterms or polyterms, i.e. single words or
combinations representing simple concepts.

Frequently co-occurring uniconcepts and uniterms can be
combined to form pre-coordinated terms of the "subject heading"
type. Most of today's thesauri comprise both uniconcepts and
phrases representing a pre-coordination of two or more concepts,
which could be called "polyconcepts".

It is generally accepted that the vocabulary of a thesaurus
should be homogeneous. However, there are a few exceptions.
The thesauri of the USAEC /3/ and MEDLARS, for example, which
are generally of the "subject heading" type, comprise a limited
number of uniterms /called "subheadings"/ to be used only as role
indicators, as it were, in combination with the main headings.

X 

2. There are three classes of relationships between the terms
of a thesaurus.

The "optional" or "indicative" type is represented by the
cross-reference A see also B, which invites the indexer in
the process of assigning term A to see whether term B is not
also relevant.

Also of the optional type are the RT /related term/,
NT /narrower term/ and BT /broader term/ relators now in use in
most of the English-language thesauri. The NT relator invites
the indexer to check whether he should not be more specific. The
BT relator suggests that the concept to be indexed might have a
wider coverage than the main entry consulted.

The "compulsory" type is represented by the reference A use
B. In many thesauri this means that term A must not be used
and term B assigned instead /"exclusive" reference/. Term B is
then either a preferred synonym or an abbreviation, or a spelled
out version of term A. But it could also mean that term B should
be assigned in addition to term A /"complementary" reference/.
Such "complementary" references are mostly "generic" references
from a specific term to a term of a higher hierarchical level,
and the additional assignment of term B is called generic
posting.

In a thesaurus where exclusive and complementary references
co-occur, it is necessary to tag them with different symbols or
relators.

A use B must be distinct from A use also B or A add B.

A third class of relators is necessary to represent the
"alternative" relationships which exist between the various
meanings of homographic terms.

Homographic terms cannot be allowed in a retrieval system
if selection of non-relevant information is to be avoided.

The references from a homographic term to the alternatives
offered must therefore be of the "exclusive" type.

A see B or C leaves the indexer free to choose which
term he will assign to replace term A.

In a number of thesauri the difficulty is overcome by adding
more or less elaborate specifications to homographic terms:
REPRODUCTION /BIOLOGY/ is unmistakably differentiated from
REPRODUCTION /COPYING/.

One of the disadvantages of this method is that the inverted
form of composite words has to be maintained in the alphabetic
list if the alternative terms are to be found. CONDUCTIVITY
/ELECTRIC/ and CONDUCTIVITY /THERMAL/ can be found by an
indexer looking for CONDUCTIVITY, but ELECTRIC CONDUCTIVITY and
THERMAL CONDUCTIVITY cannot.

Another solution consists in ignoring all but one of the
meanings of a homographic term: FRONTS /METEOROLOGY/ rules out
all other meanings of the word FRONT, and other terms such as
FOREHEAD, FRONTAGE, and BATTLE-LINE must be resorted to to
express these concepts. In some thesauri the indications
limiting the use of a number of terms to one of their meanings
are extended in the form of lengthy scope notes.



Structured vocabularies can be used in
a/ conventional information systems
b/ computer-assisted systems and
c/ fully automatic, interactive systems.

In conventional systems the document file can be arranged
by thesaurus terms; there must be as many document copies in
the file as thesaurus terms assigned to the document.

If the document collection is arranged by author name, format
or in chronological order, a separate index file is prepared
which contains, in the order given by the thesaurus /generally
alphabetical/, one abstract or title card for each thesaurus
term assigned to each document.

In a "dual-file" system, the abstract /or title/ cards are
in numerical order /direct file/, while a second file contains
one card for each thesaurus term with an indication of the
document numbers to which the term was assigned /inverted file/.

The most popular dual-file system is the "peek-a-boo card"
file.
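
The dual-file arrangement can be illustrated with a short sketch. It
assumes the direct file is a mapping from document numbers to assigned
terms; the function names and the sample terms are hypothetical. The
intersection of posting sets plays the role of overlaying peek-a-boo
cards.

```python
from collections import defaultdict


def build_inverted_file(direct_file):
    """direct_file: {document_number: [thesaurus terms]}.

    Returns one 'card' per term listing the document numbers posted on it.
    """
    inverted = defaultdict(set)
    for doc_number, terms in direct_file.items():
        for term in terms:
            inverted[term].add(doc_number)
    return inverted


def coordinate(inverted, *terms):
    """Document numbers carrying every requested term (term coordination)."""
    postings = [inverted.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings)) if postings else []


# Hypothetical direct file of three documents.
direct_file = {
    1: ["ALLOYS", "CORROSION"],
    2: ["ALLOYS", "WELDING"],
    3: ["CORROSION", "ALLOYS", "WELDING"],
}
inverted = build_inverted_file(direct_file)
hits = coordinate(inverted, "ALLOYS", "CORROSION")
```

Here `hits` lists the documents on which both terms were posted; the
abstracts themselves would then be looked up in the direct file.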

Optional and alternative references can be noted on insert
cards; compulsory references can be implemented by introducing
additional file cards or punching additional peek-holes
corresponding to the generic terms.

In computer-assisted and fully automatic systems, the
thesaurus terms and structures as well as the files are stored
on magnetic /or, exceptionally, photographic/ media.

The use of the computer's sort and print facilities for 
storing, sorting, updating, and printing successive versions 
of a thesaurus in various presentations is very popular. Compu- 
ter-printed alphabetical lists, permuted term lists, inverted 
term lists, subject category lists, etc., can be found in many 
English-language thesauri. 

The storage of the generic and semantic relationships between
terms with a view to their application to stored indexes is a
relatively new technique. This will be discussed in Chapter II.

The computer storage, processing and retrieval of direct
and inverted files, again, is practised in all non-conventional
systems, with widely varying results. These results, as
shown in Chapter III, are largely conditioned by thesaurus
structure and control.




II - Computer manipulation 
of term relationships 

Preparing a retrieval file amounts to compiling the maximum
number of relevant thesaurus terms as access points for
retrieval by term coordination. The problem is one of identifying
those thesaurus terms which are explicitly or implicitly
contained in the tape-stored starting material. The starting
material can be a corpus of titles, abstracts or full document
text, or alternatively a set of manually assigned index terms.




In the first case, the computer is fed strings of non-standard,
but generally correctly spelled words, and the "automatic
indexing" routine will result in the selection of formally
correct terms, the relevance of which cannot be guaranteed. The
routine will involve splitting the text into single words for
the identification of thesaurus uniterms, and into groups of two
or more words for the identification of composite thesaurus
terms. The identification of terms is complicated by the
presence of non-standard word forms involving prefixes and
suffixes, and by the occasional insertion of non-significant
words between term components.
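
A minimal sketch of such a matching routine follows. It is not the
routine of any particular system: it assumes a set of thesaurus terms,
a table mapping non-standard word forms to standard ones, and a list of
non-significant words, all of which are hypothetical sample data here.
Word groups are tried longest first.

```python
import re


def index_text(text, thesaurus_terms, word_forms=None,
               stop_words=frozenset(), max_words=3):
    """Return thesaurus terms found in text, in order of first appearance."""
    word_forms = word_forms or {}
    # Split into single words and drop non-significant words.
    words = [w for w in re.findall(r"[A-Za-z]+", text.upper())
             if w not in stop_words]
    # Replace non-standard word forms by their standard form.
    words = [word_forms.get(w, w) for w in words]
    found = []
    i = 0
    while i < len(words):
        # Try the longest word group first, then shorter ones.
        for n in range(min(max_words, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if candidate in thesaurus_terms:
                if candidate not in found:
                    found.append(candidate)
                i += n
                break
        else:
            i += 1  # no term starts at this word
    return found


terms = {"THERMAL CONDUCTIVITY", "ALLOYS", "CORROSION"}
forms = {"ALLOY": "ALLOYS", "CORRODED": "CORROSION"}
stops = {"THE", "OF", "IN", "AND", "IS"}
result = index_text("Thermal conductivity of the corroded alloy",
                    terms, forms, stops)
```

As the text warns, the terms selected this way are only formally
correct; nothing in the routine judges their relevance.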

In the second case, the computer has to deal with generally
standard, but often incorrectly spelled individual index terms,
of guaranteed relevance, which can be directly matched with the
thesaurus.

In both cases the computer can then be used to perform
"generic posting", i.e. to carry out the information transfer
suggested by compulsory references of the A add B type.

Note that B does not have to be an indexing term. In the
Euratom system, for example, there are several hundred A add B
references for the posting on to terms of a higher generic
level, such as names of categories and disciplines, and
cumulative designations which it would be nonsensical to use for
the indexing of documents, but which are quite selective in the
retrieval process.

The posting on to generic and cumulative terms does away 
with the need to include long arrays of alternative terms in 
query formulations. 

Contrary to what might be expected, the expense occasioned
by expanding the retrieval file through intensive generic
posting is lower than the additional cost of manipulating strings
of alternatives in the retrieval process.

Generic posting can be combined with a correction routine
based on the A use B references. The correction routine
amounts to replacing the incorrectly assigned term A by its
preferred synonym B.
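
The combination of the two routines can be sketched as follows. The
dictionaries and terms are hypothetical examples, not entries of the
Euratom thesaurus, and the sketch assumes that A add B references are
followed transitively up the hierarchy.

```python
def post(assigned, use_refs, add_refs):
    """Apply 'A use B' corrections, then 'A add B' generic posting.

    use_refs: {A: B}          exclusive references
    add_refs: {A: [B, ...]}   complementary (generic) references
    """
    posted = []
    # Correction routine: replace each term by its preferred synonym.
    pending = [use_refs.get(term, term) for term in assigned]
    while pending:
        term = pending.pop(0)
        if term in posted:
            continue
        posted.append(term)
        # Generic posting: add every term of a higher generic level.
        pending.extend(add_refs.get(term, []))
    return posted


use_refs = {"STAINLESS STEEL": "STAINLESS STEELS"}
add_refs = {
    "STAINLESS STEELS": ["STEELS"],
    "STEELS": ["IRON ALLOYS"],
    "IRON ALLOYS": ["ALLOYS"],
}
result = post(["STAINLESS STEEL", "CORROSION"], use_refs, add_refs)
```

A query on ALLOYS would then retrieve the document although the indexer
only assigned the most specific term.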

In the Euratom system, this procedure was extended to
encompass the correction of spelling errors introduced by
indexers and keypunch operators. The errors detected by thesaurus
matching and corrected manually can be fed into an "error
dictionary", the format of which resembles that of the thesaurus.
E use A means that every time the thesaurus term A is again
misspelled E, it will be automatically corrected to A.

Among the errors occurring for the first time, those
produced by addition or omission of one character, substitution
of one character for another, or inversion of two adjacent
characters, can be corrected by a separate computer programme,
which changes, for example, ALCOOLS, ALCOHOLES, ALCOBOLS and
ALOCHOLS into ALCOHOLS.

Every error corrected by this programme can in turn be fed
into the "error dictionary" as an additional E use A reference.
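
The four error types named above (one character added, omitted,
substituted, or two adjacent characters inverted) can be tested with a
short routine. This is a sketch under assumptions: the function names,
the rule of accepting a correction only when exactly one thesaurus term
qualifies, and the sample vocabulary are all hypothetical.

```python
def within_one_error(word, term):
    """True if word equals term or differs from it by a single error."""
    if word == term:
        return True
    lw, lt = len(word), len(term)
    if abs(lw - lt) > 1:
        return False
    # Skip the common prefix.
    i = 0
    while i < min(lw, lt) and word[i] == term[i]:
        i += 1
    if lw == lt:
        if word[i + 1:] == term[i + 1:]:
            return True  # substitution of one character
        return (i + 1 < lw
                and word[i] == term[i + 1] and word[i + 1] == term[i]
                and word[i + 2:] == term[i + 2:])  # adjacent inversion
    longer, shorter = (word, term) if lw > lt else (term, word)
    return longer[i + 1:] == shorter[i:]  # addition or omission


def correct(word, thesaurus_terms, error_dictionary):
    """Return the corrected term, or None if no safe correction exists."""
    if word in thesaurus_terms:
        return word
    if word in error_dictionary:
        return error_dictionary[word]
    candidates = [t for t in thesaurus_terms if within_one_error(word, t)]
    if len(candidates) == 1:
        # Record a new "E use A" reference for next time.
        error_dictionary[word] = candidates[0]
        return candidates[0]
    return None


thesaurus_terms = {"ALCOHOLS", "ALDEHYDES"}
error_dictionary = {}
fixed = [correct(w, thesaurus_terms, error_dictionary)
         for w in ("ALCOOLS", "ALCOHOLES", "ALCOBOLS", "ALOCHOLS")]
```

Each corrected misspelling ends up in `error_dictionary`, so a repeated
error is resolved by simple look-up.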

The correction routine using term and error dictionaries
can be developed, by inclusion of all possible word forms and
spellings and a list of non-significant words, into a programme
for automatic indexing. Experience has shown this approach to
yield better performances than the method using word stems and
truncated terms.

However, the quality obtainable by automatic indexing is
still largely inferior to that of indexing by specialists, and
it can only become attractive if this shortcoming is balanced
by lower cost. The major cost component being that for input
rather than computer processing, automatic indexing will spread
as the availability on tape of the starting material, i.e. the
texts of documents and abstracts, comes into general use as a
byproduct of publication. Optical character recognition devices
are not expected to become economic to the point where
automatic indexing would be competitive.

The computer is also used to check the internal consistency
of the thesaurus structure. In particular, every time a new
term or a new reference is introduced, a complex automatic
routine checks that the introduction does not lead to
duplications, contradictions or continuous loops within the
thesaurus or between the thesaurus and the error dictionary.
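
One part of such a check, the detection of duplications and loops, can
be sketched as below. The representation of references as a mapping
from a term to its target terms and the function names are assumptions
made for illustration; the search is a naive depth-first walk, adequate
for a sketch but not tuned for a large thesaurus.

```python
def find_loop(references, start):
    """references: {term: [target terms]}. Return a loop reachable from
    start as a list of terms, or None if there is none."""
    path = []

    def visit(term):
        if term in path:
            return path[path.index(term):] + [term]
        path.append(term)
        for target in references.get(term, []):
            loop = visit(target)
            if loop:
                return loop
        path.pop()
        return None

    return visit(start)


def can_add(references, source, target):
    """True if the reference source -> target is neither a duplicate
    nor the closing link of a loop."""
    trial = {term: list(targets) for term, targets in references.items()}
    trial.setdefault(source, [])
    if target in trial[source]:
        return False  # duplication
    trial[source].append(target)
    return find_loop(trial, source) is None


refs = {"A": ["B"], "B": ["C"]}
loop = find_loop({"A": ["B"], "B": ["A"]}, "A")
```

Running `can_add` before every insertion keeps the stored structure
free of the loops that would otherwise make posting run forever.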
                 III - Effect of generic posting
                       on thesaurus building

The number of A use B references in a thesaurus is
generally kept as low as possible: it seems senseless to
overburden a term list with terms that the indexer is not
permitted to use. The majority of the terms are thus authorised
for indexing and retrieval. This makes it easy for the indexer
to find at least one term corresponding to the concept he has
in mind, but the retriever has to browse through a great many
RT, NT and BT references in order to assemble all the terms
relevant to the query, i.e. all the terms that the indexer
could have assigned to relevant items. The use of automatic
generic posting radically alters the picture. First, the
indexer can now limit himself to assigning the most specific
terms corresponding to the concept to be indexed, since every
relevant generic term will be automatically added. Indexing
depth will decrease, as also the indexer's workload per item. On
the other hand, the retriever will not have to bother about
inclusion of alternate terms, since every term he may select for
query formulation will bear the postings of all hierarchically
subordinate terms.

In the Euratom System, where generic posting is being
applied to a very large extent, the influence on retrieval
performance has been the following:

Relevance ratio has significantly increased, since the
indexers are free to follow their natural inclination, which
is to put systematically more emphasis on specific indexing.

Recall ratio also increased, since it is no longer possible
for the retriever to overlook relevant specific terms in the
query formulation.



As both indexing and query formulation have become elementary
operations, they take less time and have become less costly.

The Euratom Thesaurus was compiled as early as 1962 as a
set of equivalent descriptors, following the example of /8/.

A great many A use B references were then developed as a
result of the indexing of more than 100,000 abstracts, and
the indexers were systematically invited "to use term B in
preference to term A whenever pertinent".

When, in 1964, Euratom's first computer was used to verify
consistency of application of this rule, it became evident that
the same inexpensive operation could be used for correcting
errors /E use A/ and for generic posting /A add B/. Throughout
the following years, A add B references from new specific
terms /A/ to already existing descriptors /B/ were added to the
thesaurus every time it was felt desirable. The distribution
of the new references over the subject field was therefore not
uniform. And the descriptor part of the thesaurus remained
virtually unchanged, since many of the modifications which might
have been considered desirable would have necessitated extensive
re-indexing.

The situation is quite different in the field of metallurgy.

The European Community is now considering extending its
information dissemination activities into the fields of
metallurgy and, eventually, agriculture.

Preparatory work is being done including an inventory of
the metallurgical literature and compilation of a metallurgical
thesaurus.

This seems an excellent opportunity to apply the new
methodology of thesaurus building.

A number of basic rules and a plan of work were agreed
upon, and work on the thesaurus started in October 1966.

The following is an outline of our plan of work and the
way we went about it.

Step 1: Define the subject field to be covered and
make an inventory of existing thesauri and previous terminology
studies.

Result: Three English-language thesauri compiled by ...
and a French thesaurus developed by ..., a number of excellent
encyclopedias, the Metals Handbook, and the subject indexes of
the major abstract journals.

In view of the number and quality of these documents, it
was considered superfluous to compile a representative set
of terms by statistical analysis of a text corpus.

Step 2: Eliminate duplicates from the term collection.

Result: From a total of 10,900 terms originating from
three thesauri, 1,100 duplicates and 900 triplicates were
deleted by alphabetical merging of the term lists, leaving a
collection of 8,000 terms.

This step could have been combined with step 6; taken
separately, it made steps 3 and 5 easier.
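
Alphabetical merging of sorted term lists can be sketched with the
standard library. The sample lists are hypothetical; the sketch assumes
that each input list is already in alphabetical order and that a term
occurs at most once per list.

```python
import heapq
from itertools import groupby


def merge_term_lists(*sorted_lists):
    """Merge alphabetically sorted term lists into one list without
    repeats. Returns (merged_terms, number_of_entries_removed)."""
    merged, removed = [], 0
    # heapq.merge yields all terms in order; equal terms arrive adjacent.
    for term, copies in groupby(heapq.merge(*sorted_lists)):
        count = sum(1 for _ in copies)
        merged.append(term)
        removed += count - 1  # a duplicate removes 1 entry, a triplicate 2
    return merged, removed


list_a = ["ALLOYS", "CASTING", "STEELS"]
list_b = ["ALLOYS", "CORROSION", "STEELS"]
list_c = ["ALLOYS", "WELDING"]
merged, removed = merge_term_lists(list_a, list_b, list_c)
```

In this sample, ALLOYS is a triplicate and STEELS a duplicate, so three
entries disappear from the eight that went in.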

Step 3: Divide the subject field into coherent units
containing 100 to 300 terms.

Result: Creation of 37 term subsets covering various
aspects of metallurgy, including materials, properties and
processes.

Step 4: In heavily loaded subsets, take special measures
to standardize term selection and format.

Result: Creation of a rule for the generation of
designations of inorganic compounds, ores, alloys and isotopes,
by combination of element names with the words COMPOUNDS, ORES,
ALLOYS, ISOTOPES, etc.

This rule may well burden the printed thesaurus, but not
the user's memory.

Step 5: Display the terms corresponding to each unit
in semantic charts, grouping conceptually related terms around
their preferred synonyms.

Result: 900 preferred terms, defined by their semantic
context, earmarked for descriptor status, with an average of 8
terms clustered around.

Step 6: Define the reference type for each term and
record the references for computer storage.

Result: 3,800 descriptors, 1,000 A use B references,
30 A see B1 or B2 references, and several thousands of A add B
references, on 80-column thesaurus form sheets.

Almost every term had to be verified in one or more
handbooks and encyclopedias, decisions about homographic terms
being the most time-consuming.

A number of terms were considered valueless in a thesaurus
for use in a system based on term coordination, but they were
retained, generally as A use B references, in order to achieve
complete convertibility between the basic term lists /9/ /10/ /11/
and the new thesaurus.

Step 7: Computer storage, consistency check and
print-out.

Result: All the terms keypunched on 80-column cards,
stored on magnetic tape, sorted alphabetically by first and by
second terms, resulting in a printed thesaurus and an "inverted
dictionary" with related terms grouped in synonym clusters.

Duplicate entries, contradictory entries and reference
loops are eliminated in the process.

Step 8: General revision and verification.

Result: The display charts were checked against the
"inverted dictionary" and given their final presentation.

Term lists and display charts were critically examined by
several experts; as a result of their comments a number of
modifications were made.

The total effort involved in the compilation of the
metallurgical thesaurus by two members of the staff and a small
number of experts participating in the project was calculated
at approximately 700 working hours, or four man/months.

Production of the two listings required only a few minutes
of computer time using the IBM-360/40.

The above figures seem quite incredible when compared, for
example, with the tremendous effort that went into the ...
or DoD Thesauri. Wallis figures represent a total
of 70 man/months.

It is not surprising that, in the light of such an enormous
expenditure of expert time /and money/, many studies were
started on the possibilities of automatic generation of
thesauri, which by the way, in turn, are costing a lot of
computer time /and money/.

Now that the efficiency of automatic posting has been
conclusively demonstrated, and with graphic display of semantic
and hierarchic relationships making the compilation of generic
references such an elementary operation, there is no doubt that
more and more thesauri will be built, economically and
efficiently, using these methods.

                 IV - Similarity factors

Generic posting takes care of the BT references in indexing
and their NT counterparts in retrieval. The RT relators, however,
representing a variety of ill-defined relationships, have
escaped all attempts at computerizing so far. The documentalist
in charge of indexing or query formulation will assess, from the
display of semantic relationships in the terminology charts,
which is the term most likely to represent a valuable
alternative to a given concept because of its similarity or
closeness to it. If this "semantic similarity" between two
concepts or terms could be given a numerical value, the computer
could be taught to store and use semantic similarity factors for
a number of /or all/ term couples in a thesaurus.

The expression A/50/B might be used to mean that B is
semantically similar to A to the extent of 50 percent, and a
document tagged with term B might be expected to be half as
relevant as a document tagged with term A.

Various methods have been thought up, aiming at making the
computer determine statistically the value of the similarity
factors on the basis of an indexed data base. So far these
experiments have not been conclusive.

We therefore consider making use of numbers assigned on
the basis of expert judgment in the planned experiments
involving similarity factors.

The basic experiment, of which a number of variations are 
under consideration, can be described as follows: 

A question expressed by the customer as a combination of
three aspects, A B C, would normally be modified by the
information officer to include a number of alternative terms
/A2, B2, B3/ which he judges to be relevant in this particular
case. The query thus becomes

          /A + A2/ /B + B2 + B3/ C

If, however, the computer knows the similarity factors of
all term couples involving A, B, and C, it can be told to
select all terms having a similarity above a given cut-off value,
say, 30 percent, and apply these as coefficients to the terms
in the Boolean formula:

          /A + 0.8 A2/ /B + 0.9 B2 + 0.6 B3/ C

A document tagged /A, B, C/ would be retrieved with a
relevance probability /the product of the similarity factors
involved/ of 100%, a document tagged /A2, B, C/ with a
probability of 80%, a document indexed /A2, B2, C/ with a
probability of 72%, and a document indexed /A, B3, C/ with a
probability of 60%.

The computer, using the similarity factors, thus not only retrieves
all the documents found by the human querist, but produces a ranking of
the selected references according to their relevance probability. This
relieves the system operator /or the final user/ of part of his burden and
reduces the total cost of information retrieval.
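
The arithmetic of the basic experiment can be reproduced in a few
lines. The sketch assumes that each aspect of the query is a table of
terms with their similarity factors (1.0 for the original term) and
that, when a document carries several alternatives of one aspect, the
highest factor counts; that last rule and all names are assumptions,
since the text does not settle the point.

```python
def relevance(document_terms, query):
    """query: list of aspects, each a {term: similarity factor} table.
    Returns the product of the best factor found for every aspect,
    or 0.0 if some aspect is not represented in the document."""
    score = 1.0
    for aspect in query:
        best = max((factor for term, factor in aspect.items()
                    if term in document_terms), default=0.0)
        score *= best
    return score


# /A + 0.8 A2/ /B + 0.9 B2 + 0.6 B3/ C
query = [
    {"A": 1.0, "A2": 0.8},
    {"B": 1.0, "B2": 0.9, "B3": 0.6},
    {"C": 1.0},
]
documents = {
    "d1": {"A", "B", "C"},
    "d2": {"A2", "B", "C"},
    "d3": {"A2", "B2", "C"},
    "d4": {"A", "B3", "C"},
    "d5": {"A", "B"},          # aspect C missing: not retrieved
}
scores = {name: round(relevance(terms, query), 2)
          for name, terms in documents.items()}
ranking = sorted((name for name in scores if scores[name] > 0),
                 key=lambda name: -scores[name])
```

The ranking reproduces the order given in the text, from the fully
matching document down to the one retrieved through B3.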

Similarity factors are expected to play a role in the man-machine
interactive information systems of the future.


ANSWERS TO THE QUESTIONNAIRE ON THESAURUS PROBLEMS



ante A.Sebancba^ 





1a A thesaurus always comprises an alphabetically arranged,
   and may also contain a systematically arranged, display of
   evaluated subject-oriented terms for indexing and retrieval
   purposes.

1b To call a given construction a "thesaurus" the following
   semantic and syntactic principles have to be fulfilled:
   a/ The thesaurus has to be structured as to synonyms and
      quasi-synonyms, for instance by means of USE-references,
      SEE ALSO-references and USED FOR-references.
   b/ The thesaurus must contain definitions of ambiguous
      terms, for instance by means of scope notes added to
      the term.

   c/ The thesaurus must be accompanied by a set of rules
      giving instructions for:

      use of the various grammatical forms of the words, like
      the use of nouns, adjectives, adjectives together with
      nouns, use of singular-plural form and so on,

      use of accepted abbreviations /like "VTOL aircraft"
      for "vertical take off and landing aircraft"/,

      use of symbols like //, Greek letters, "per cent",
      "degrees centigrade" etc.,

      use of trade names,
      use of geographic names,
      use of technical slang,



Studieselskapet for Norsk Industri, Oslo.




      combination of single terms in the thesaurus to complex
      terms in indexing and retrieval,

      direct or inverted entry of complex concepts consisting
      of more than one word /like "fluidized bed" and "bed,
      fluidized"/,

      the candidate terms must be evaluated by a subject
      specialist as well as by a linguist.

   The abovementioned are the basic principles of thesaurus
   construction. In addition to these there are other
   principles, the fulfillment of which gives rewarding
   advantages to the thesaurus as a tool for indexing and
   retrieval:

   A structurization of the thesaurus by means of a
   hierarchical listing of the terms or by a listing of broader
   term /BT/ narrower term /NT/ relationships between the
   terms.

   Such a structurization will cause secondary practical
   effects on the thesaurus by giving possibilities for
   preparing alphabetical lists for groups and subgroups
   within special fields. These may, besides serving their
   main purpose /indexing and retrieval tool/, be of value
   for studying the depth of indexing by means of word
   frequency studies. Such studies will in turn contribute to
   the development of the thesaurus towards an optimalisation
   of usefulness. This means that the thesaurus contains a
   minimum number of rarely-used or too-often-used terms and a
   maximum number of relevant terms. It means further that the
   thesaurus should be supplied continuously with new
   evaluated terms and should be freed from obsolete terms by
   continuous deletion. This last point is in accordance with
   rapid technical development within the various special
   fields.

1c The elements or factors which influence the construction
   of a thesaurus are mentioned in 1b. Besides these factors,
   the aim of the thesaurus ought to be considered from case
   to case: whether it is going to serve a "polytechnical"
   /several branches/ or a more specialised field, whether it
   is going to serve one branch /company/ or one subject field
   and whether it is going to serve as a tool for both indexing
   and retrieval purposes. It is also important to evaluate
   the desired depth of indexing from case to case. This
   in turn will influence the thesaurus construction.

1d The degree of complexity and the number of information
   items /terms and various presentation forms of these/ have
   to be evaluated in connection with the aim of the thesaurus
   /answered in 1c/, in terms of what purposes are to be
   fulfilled. The economy of the thesaurus project in question
   is of course also an important factor, which will determine
   the degree of complexity.

2. When applied to the use of the thesaurus for coordinate
   indexing and retrieval, the existing definitions found in
   recent literature and used among documentalists /something
   like the definition given above under 1a/ should be
   sufficient. If the thesaurus concept is to cover broader
   uses, its definition has to submit to further analysis.

3. The role of the thesaurus is, according to the answer under
   2, primarily for direct use as a most necessary tool in
   systems for indexing and retrieval. Secondarily it can be
   used as a vocabulary and dictionary aid for other purposes,
   for instance within scientific information as a whole, such
   as for authors editing reports and journal articles, for
   librarians in their choice of keywords for all kinds of
   scientific library material, etc.

4. The most rewarding way of collecting terms for a thesaurus
   is no doubt the compilation of evaluated terms from current
   indexing of new literature in the field in question,
   because this gives least "noise" with regard to artificial
   and seldom-used terms. It gives the current up-to-date
   language. In this work there are obvious advantages in using
   a computer when it comes to alphabetisation of the word
   collection, for preparing alphabetised subject field lists,
   for construction of hierarchical connections, for production
   of inverted lists, permuted lists, for word frequency
   studies etc.



5. Of the topics mentioned in the questionnaire the first is
   in our opinion the most important, i.e. the classification
   of thesauri according to branches/subject. In fact
   both of these classifications can be of value.

   A company thesaurus usually ought to be a "branch"
   thesaurus, while a thesaurus for a scientific institution
   ought to be more subject-oriented.

6. A descriptor is a preferred keyword /term/ which in a
   semantic hierarchy has a place which may comprise a larger
   concept field than the one which is covered by one single
   keyword /term/.

7. The answer to this question is given under 1b.

8. The answer to this question is given under the last part
   of 1b.


RECOMMENDATIONS FOR THE BUILDING OF THESAURI
IN SCANDINAVIAN LANGUAGES

Regler for bygging av thesauri på nordiske språk



Extract in translation 

XX 

Henning Spang-Hanssen

As to general points of view and as to order of presentation,
the recommendations are in accordance with the Thesaurus
Rules and Conventions found as Appendix 1 to the Engineers
Joint Council's Thesaurus of Engineering and Scientific Terms
/New York 1967/.

Definitions: 

By a document is meant the smallest bibliographic unit
relevant to a system of registration and retrieval.

By a keyword is meant any word or composition of words
appropriate in retrieval for the characterization of the content
of a document.

By a descriptor is meant /1/ a preferred keyword that /2/
within a conceptual /semantic/ hierarchy takes up a position
that /3/ may cover a conceptual field larger than the field
covered by an individual keyword.

Purpose of a thesaurus:

A thesaurus serves the purpose of indicating the transfer
of keywords /found in documents and in retrieval queries/ into
descriptors accepted in the information store of a system of
registration and retrieval.

* By the Working team for ... Council for Applied Research
/NORDFORSK/.

** Danmarks Tekniske Bibliotek, Copenhagen.

Contents of a thesaurus:
A thesaurus obligatorily comprises an alphabetized list.
This list should include all keywords and all descriptors.

In addition, a thesaurus may comprise
1/ a permuted index /allowing to enter from inversions of normal
   word order, e.g. from absorption acoustic/;

2/ a subject category index, in which the descriptors are
   arranged according to categories /facets/;

3/ a hierarchical index in which descriptors having BT- or NT-
   references /cf. C-1/ are arranged hierarchically below the
   descriptor having the broadest meaning /covering the
   broadest conceptual field/;

4/ a list of notations /e.g. numeric codes/ for descriptors
   and for corresponding keywords.
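
The permuted index of item 1/ can be illustrated by a minimal sketch. It assumes that "permuted" means every cyclic rotation of a multi-word term becomes an entry pointing back to the term in its normal word order; the function name `permuted_index` and the sample terms are hypothetical and are not taken from the recommendations.

```python
def permuted_index(terms):
    """Return sorted (entry, term) pairs: one entry per word rotation.

    Assumption: an entry is a cyclic rotation of the words of a term,
    so that a searcher can enter the list from any word of the term.
    """
    entries = set()
    for term in terms:
        words = term.split()
        for i in range(len(words)):
            # Rotate so that word i leads; the remaining words follow.
            rotated = words[i:] + words[:i]
            entries.add((" ".join(rotated), term))
    return sorted(entries)


if __name__ == "__main__":
    for entry, term in permuted_index(["acoustic absorption",
                                       "thermal conductivity"]):
        print(f"{entry:<25} see: {term}")
```

With the term "acoustic absorption" the sketch produces the inverted entry "absorption acoustic" next to the term in its normal order, which is the kind of entry the parenthesis in item 1/ describes.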

Selection of descriptors:
Descriptors are selected in accordance with their use in
documents for indexing. In addition to all conceptually
important words, forming the core of a thesaurus and being closely
affiliated to professional terminology, descriptors may include
names of persons, institutions, projects, locations, etc., and
furthermore numbers, numerical intervals, and other kinds of
identificatory symbols. Even bibliographical indications may
act as descriptors.

The usefulness of a descriptor depends on
/a/ whether it is likely to occur in retrieval queries as an
    element conveying information;

/b/ whether it can be given a meaning distinguishable from the
    meanings of other descriptors;

/c/ whether it can be given a definable meaning of general
    understandability.

Not useful as descriptors are words

/a/ never occurring in the relevant documents;

/b/ occurring in any document;

/c/ never occurring in retrieval queries;

/d/ having a vague and undefinable meaning;

/e/ having a meaning very close to that of some already-accepted
    descriptor.
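
The first two exclusion criteria lend themselves to a simple count. The sketch below is an illustration only: it reads "occurring in any document" as "occurring in every document of the collection", treats a document as a bag of lower-cased words, and uses a hypothetical function name and sample data.

```python
def filter_by_document_frequency(candidates, documents):
    """Drop candidate words covered by exclusion criteria /a/ and /b/.

    /a/ the word never occurs in the relevant documents;
    /b/ the word occurs in every document (assumed reading of
        "occurring in any document"), so it cannot discriminate.
    """
    doc_word_sets = [set(doc.lower().split()) for doc in documents]
    kept = []
    for word in candidates:
        # Document frequency: number of documents containing the word.
        df = sum(word.lower() in words for words in doc_word_sets)
        if df == 0:                      # criterion /a/
            continue
        if df == len(doc_word_sets):     # criterion /b/
            continue
        kept.append(word)
    return kept


if __name__ == "__main__":
    docs = ["the pump failed", "the valve failed", "the pump leaked"]
    print(filter_by_document_frequency(
        ["the", "pump", "valve", "turbine"], docs))
```

The remaining criteria concern query behaviour and meaning, and call for human judgement rather than counting.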

For descriptors the singular of nouns is used.*               T-3

Instead of distinguishing in the way of English [...],
indicators of content should be used when necessary,
e.g. building /process/ versus building /object/.



Indicators of content:
In cases of doubt as to the meaning of a word to be accepted
as a descriptor, the meaning should be pointed out by adding
an indicator of content /a scope note/. The indicator is
given in brackets, but forms part of the descriptor. The
following cases deserve mention:

/1/ Cases of homonymy, e.g. English mercury /metal/ and Mercury
    /planet/.

/2/ Cases where a noun is commonly used both of a process and
    of a product or a property /cf. above/.

/3/ A composition of words may have different meanings according
    to varying semantic relations between the components, and an
    indicator of content may thus be desirable, e.g. English
    water cooling /with water/ versus water cooling
    /cooling of water/.

/4/ Trade names and possibly other names may need an indication
    of the kind of article in question and/or of the fact that
    the name in question is a registered trade mark.

Complex descriptors versus coordinated descriptors: T-11

Normally descriptors should be chosen in accordance with the
terminology found in the relevant literature regardless of the
number of words used in order to express the concept /the meaning/
in question. However, many potential descriptors seem to
express concepts that are combinations of two or more other
/potential/ descriptors. In such cases a decision must be made
as to the inclusion of the specific, complex descriptor or the
treatment as a combination of descriptors.

* The rules here for Scandinavian languages thus deviate from
the American rules.

Specific, complex descriptors often facilitate the retrieval
of specific information, but increasing the number of
descriptors generally increases costs of indexing. The use of
individual /not complex/ descriptors by indexing, and combining
them during retrieval, permits a smaller thesaurus and
a more consistent terminology.

/a/ A specific, complex descriptor should be set up, if relevant
    and more general descriptors are not found in the
    thesaurus in question. In order to be relevant a combination
    of more general descriptors must comprise at least
    one descriptor that is a member of the same hierarchical
    class as the specific concept.

/b/ A specific, complex descriptor should be set up, if the
    specific concept is itself frequently met with, or if one
    /or both/ of the more general descriptors is very
    frequently met with.

In cases of doubt it is advisable to set up the specific,
complex descriptor, at least temporarily, because in a working
system it is easier to split up a complex descriptor than to
combine existing descriptors into a new, complex descriptor.

Cross references: C-1

The connections between the descriptors and keywords within
a thesaurus are indicated by reference symbols having
fixed meanings. In thesauri in the English language the most
widespread symbols are /abbreviations of English words/

    USE, UF /used for/, BT /broader term/, NT /narrower term/,
    RT /related term/.

In a thesaurus in a Scandinavian language it is not recommended
to introduce abbreviations specific to this language,
because confusion may arise. It is advisable /1/ either to
accept the widespread English abbreviations, regarded as pure
symbols, or /2/ to standardise - if possible - new, figurative
symbols, e.g.

          >      <      A      V      //

    for   USE    UF     BT     NT     RT

where A and V are to be read as letter substitutes for arrows
pointing respectively upwards and downwards.

/More detailed definitions and instructions with examples
for the 5 types of reference are given in the original as
sections C-2 to C-9. These, and rules for alphabetising
/section A-1/ are omitted in this extract; from the numbering it
will be seen that even sections T-2, T-4, T-6 to T-10 have
been omitted. This translated extract tends to cover points
of particular interest as to principles of thesaurus
construction/.
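
The five reference symbols of section C-1 can be held in a small data structure. The following is a sketch under stated assumptions, not the procedure of the omitted sections: it assumes the customary reciprocity of thesaurus references (USE with UF, BT with NT, RT with itself), and the class name `Thesaurus`, its methods and the sample terms are hypothetical.

```python
from collections import defaultdict

# Assumed reciprocal of each reference symbol.
RECIPROCAL = {"USE": "UF", "UF": "USE", "BT": "NT", "NT": "BT", "RT": "RT"}


class Thesaurus:
    """Store cross references between keywords and descriptors."""

    def __init__(self):
        # refs[term][symbol] is the set of terms referred to.
        self.refs = defaultdict(lambda: defaultdict(set))

    def add(self, term, symbol, target):
        """Record 'term SYMBOL target' together with its reciprocal."""
        self.refs[term][symbol].add(target)
        self.refs[target][RECIPROCAL[symbol]].add(term)

    def descriptor_for(self, keyword):
        """Follow a USE reference; a term without one stands for itself."""
        targets = self.refs[keyword]["USE"]
        return sorted(targets)[0] if targets else keyword


if __name__ == "__main__":
    t = Thesaurus()
    t.add("motor car", "USE", "automobile")
    t.add("automobile", "BT", "vehicle")
    print(t.descriptor_for("motor car"))     # the preferred descriptor
    print(sorted(t.refs["vehicle"]["NT"]))   # narrower terms of 'vehicle'
```

Storing the reciprocal at the moment a reference is entered keeps the alphabetized list and any hierarchical index derived from it consistent with each other.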
PROBLEMS OF THESAURI
Jlxl {DouvF 

The questionnaire does not contain questions which most
probably could incite the participants to discussion - namely
questions about the problems and the disadvantages of thesauri.
As our host has invited us to add freely new points to
the questionnaire, this paper tries to concentrate especially
on both these aspects. In order to keep to the order of items
as shown in the questionnaire, I shall deal with the individual
points in the order of the questionnaire items.

ad 1. - 2. 

One of the most important activities of librarians and
information specialists is to index or classify the information
materials, i.e. to express the topic of documents briefly
by means of terms /called subject headings, uniterms,
descriptors, key-words etc./. The terms representing the topics
of documents can be arranged either alphabetically - according
to the external form of the term - or systematically - according
to their internal content. There is no third way of arranging
them. These two ways of arranging the terms correspond
to the two main trends of ordering systems, namely
alphabetic indexing systems /like subject headings, uniterms,
thesauri of descriptors, key-words/ and systematic
classification systems /both traditional and faceted/.

In English there unfortunately does not exist a broad
term for the two narrow terms "indexing" and "classification".
In Czech we use the broad term of "ordering systems" /similarly
the Germans use the expression "Ordnungssysteme"/.
In this paper the words "ordering systems" will be used to
express both alphabetical and classification systems.

* Czechoslovak Academy of Sciences, Prague.

The thesaurus is an alphabetical system. It is necessary
to stress this, because in some cases this expression is used
nowadays to describe any special ordering system /as distinct
from a universal classification system/, even if it is
systematic. Unclear terminology would exclude discussion;
therefore by "thesaurus" we should always understand an
alphabetical system.

ad 3. Role

In conventional information systems it is immediately
evident whether the catalogue is alphabetical or systematic,
because the records in the files are arranged in one or the
other way. In a mechanised system we cannot judge the character
of the ordering system according to the arrangement of
records. Most frequently their arrangement on magnetic tape
is chronological.

Indexing and classification fulfil an important role -
namely to arrange the records in storage.

Every classification system uses systematic tables
/systematical display of terms/ and an alphabetical index
/alphabetical survey of terms/. Which of the alphabetical ordering
systems, whether subject headings, uniterms or thesaurus, uses
both of these aids? I somehow doubt that graphic maps of terms
in a thesaurus correspond to the tables of a classification
system. The lack of a systematic survey is a great
disadvantage of many thesauri.

ad 4. Construction

Referring to my papers presented at the Elsinore conference
in 1964 and at the Second Anglo-Czechoslovak Conference
of Information Specialists in London in 1967 I shall repeat
briefly the main ideas of the ordering principles and the
theoretical considerations connected with it:

Every ordering system is governed by a combination of
ordering principles, whether alphabetical or systematic.

Nearly every one of these principles is in contrast with
another principle:



Principle                        Contrasting principle

alphabetical arrangement         systematical arrangement
of terms                         of terms
uncontrolled dictionary          controlled dictionary
no hierarchy                     hierarchy
open-ended system                closed system
precoordination                  postcoordination
alphabetical display of terms    systematical display of terms
                                 /tables/



In addition to this there are two principles which have
no antitheses - the principle of categories and the principle
of expressing terms by notation. Both these principles are
characteristic of the classification systems.

The chief idea expressed in the paper presented in London
was that there was a general tendency of these principles to
merge and that the best solution seemed to be achieved by a
synthesis of opposing principles:

Uniterms have no hierarchy, while UDC uses a strong hierarchy.
Neither an excess nor a lack of hierarchy is good. A weak
hierarchy therefore seems the best solution.

An uncontrolled vocabulary has great disadvantages, but
similarly a controlled vocabulary - as we all feel - very
often has the disadvantage of a Procrustean bed. In my opinion
there will be a general tendency to add uncontrolled
words to the records. I call them "explicative words" since
they have no selective power /no search can be made on their
basis/. We have introduced these words in our mechanised
information system, and they serve us well. They are
very important in the computerised systems where bibliography
has replaced abstracts and where the strict use of a controlled
vocabulary endangers the understanding of the content of
a document.
Precoordination or postcoordination - that is the question.
In judging this problem we must introduce into our considerations
the question of whether the ordering system is used in a
bound index, in a file or in a multidimensional medium like
peek-a-boo cards, punched cards or a computer.

The conventional systems /bound indexes and files/ prefer
precoordination, and multidimensional records prefer
postcoordination. In spite of this I think that the best way of judging
whether to precoordinate or postcoordinate is to ask: How will
the user seek the information? If he is likely to request a
document about the "circulation of journals", he must be given
the possibility of searching under this precoordinated complex
term. If he is likely to request also documents about the
"circulation" of other materials, it is necessary to enter this
document also under this postcoordinated term. And if he is not
likely to demand the records about "journals" in general, we
shall not classify it under this term at all. This way of
solving the problem also means a synthesis of two principles
/precoordination and postcoordination/.
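
The "circulation of journals" case can be shown with a small inverted file. This is a hedged sketch, not a description of any system mentioned in the paper: the names `postings`, `index_document` and `search`, and the second sample document, are assumptions introduced for illustration.

```python
from collections import defaultdict

# Inverted file: term -> set of document numbers entered under it.
postings = defaultdict(set)


def index_document(doc_id, terms):
    """Enter a document under each term chosen by the indexer."""
    for term in terms:
        postings[term].add(doc_id)


def search(*terms):
    """Postcoordinate search: documents entered under all given terms."""
    sets = [postings.get(term, set()) for term in terms]
    return set.intersection(*sets) if sets else set()


# Document 1 is entered under the precoordinated complex term and under
# "circulation", but not under "journals" in general.
index_document(1, ["circulation of journals", "circulation"])
index_document(2, ["circulation", "books"])

if __name__ == "__main__":
    print(search("circulation of journals"))   # precoordinated entry
    print(search("circulation"))               # broader request
    print(search("circulation", "books"))      # terms combined at search
    print(search("journals"))                  # deliberately not indexed
```

The choice of entry terms is made from the expected requests of the user, which is the criterion the paragraph above proposes.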

Last but not least come the two antitheses - alphabetical
and systematical arrangement of terms. Can we ever reconcile
these two principles? The answer is affirmative; moreover this
synthesis can be applied in two ways. In the first place by
using both an alphabetical index of terms and a systematical
display /tables/ in every ordering system.

However, within this, synthesis can be used even in the
ordering system itself. We have done so in our faceted
classification in the INDQBI8 information system. This faceted
classification consists of four categories. The classes in the
categories are arranged alphabetically according to the capital
letters /which were chosen mnemotechnically/. The terms in the
classes are also expressed by mnemotechnical notation /in small
letters/ and again arranged alphabetically. The system is a
faceted classification with alphabetical arrangement of
classes and terms.
All the considerations about ordering principles apply both
to alphabetical systems and classifications.

ad 5. Interbranch relations

Thesauri are special ordering systems; this is their great
disadvantage in comparison with universal classification
systems like D.C., U.D.C., Soviet Bibliographical-Librarian
Classification, Library of Congress classification etc. I doubt
whether a universal thesaurus could be built for a simple reason
- when a thesaurus consists of more than 600 - 1,000 terms, both
the indexer and the user lose the survey of the thematical
content of the information system. A larger system needs
unconditionally thematical tables /or at least maps/.

Although one is aware of all the drawbacks of different
universal classifications, it is at least certain that they have
not the great disadvantages of all special thesauri - the problem
of marginal fields and the problem of auxiliary terms.

In every thesaurus the terms for marginal topics /chemistry,
physics, mathematics/ must be introduced again and again /always
in a different way/. There is no compatibility, although so many
thesauri need these terms. Evaluating the situation in Europe
now, one realises that the only coordinating factor in the
network of information centres - international and universal
classification - is disappearing and the network is disintegrating.
We are coming to an atomising stage - every branch and every
nation /in the same branches/ is constructing its own thesaurus.
This is a problem of capital importance. On the other hand the
introduction of computerised systems demands a deep analysis in
indexing - and a special ordering system /whether it be the
thesaurus or a faceted classification/. How to solve this problem?
It is certainly one of the greatest problems of this and other
conferences on the problems of ordering systems.

My personal opinion is that we need a general classification
system forming a roof above the special ordering systems /both
thesauri and faceted classifications in individual branches of
human knowledge/. From this universal classification, which
would not go into detail /this would be the task of special
systems/, we should use the terms for all marginal fields
and all the auxiliary terms.

Who will work on the construction of this universal
classification? In spite of the valuable work of the London
Classification Research Group it is clear that it would take years to
construct a new system of universal faceted classification. And
reconstructing UDC for this purpose would also demand great
efforts. It is obvious that besides this, it will be necessary
to state general principles for the construction of special
ordering systems.
BUILDING OF THE THESAURUS
David C. Weeks*

Two principal considerations in approaching the task of
thesaurus building are: the domain which is to be represented
and the sources from which concepts and terms are selected.
Whereas scientific and technical information systems have
tended to develop independently in the past, we must recognise
the inherent limitations of such a random pattern and seek
means of overcoming those limitations. Influences helping to
reverse this tendency stem from an awareness of the frequent
overlap between domains or even disciplines and from a growing
need for system-to-system communication.

The impact of these influences is apparent when we consider
the elements that comprise the language of a thesaurus
developed for a specific application in a discipline. Note,
however, that the discipline is most probably a subdiscipline of
some comprehensive discipline. This is largely a reflection of
the trend towards systems serving narrower fields but treating
them more intensively.

In these circumstances, a discipline-oriented thesaurus can
be expected to contain four vocabularies, each essential to the
literature of the domain and the retrieval interests of its
users:

. the core concepts and terms of the discipline - the language
  essential to research and communication, describing the
  objects of concern and activities of the practitioner.

. the major discipline of which this domain forms a part -
  the scientific hierarchy illustrated by the domain of
  entomology which is a sub-set of invertebrate zoology -
  zoology - biology.

. other disciplines upon which the scientist or technician
  draws in the performance of his work. These may be
  cross-discipline domains such as chemistry/biology, physics/
  chemistry; or they may be used as tools of research such
  as statistics, logic, mathematics.

. the domains where this discipline may be applied - best
  illustrated by the application of information science to
  any scientific discipline.

* George Washington University BSCP, Washington.

Unless each of these elements is represented in sufficient
detail, the thesaurus cannot be effectively employed as a tool
for information control. It will fail to provide access to the
varied topics contained in a corpus of documents and it will
be unresponsive to the user's needs. As the interest in and
need for system-to-system communications becomes more intense,
the task of building a thesaurus becomes more complex. One way
of reducing the difficulty and at the same time promoting
compatibility is the development of standard, uniform methods for
thesaurus building. This will encourage the symmetry and
balance which are so essential to conversion from one system to
another and for one system to complement another. Another means
to the same end would be the development of standard components
which could be incorporated into several thesauri, since the
non-core elements are common to a number of domains.

Both possibilities require the combined efforts of
specialists from various disciplines. The result of such projects
would greatly assist in halting the random development of
uncoordinated thesauri and of incompatible information systems.




Sources of Concepts and Terms 

The term thesaurus is becoming a common name for any
information system, indexing scheme or language. At the same time,
the concept of a thesaurus is also frequently equated with a
somewhat structured language in which broader, narrower and
related terms are arrayed - not always according to some
formula or rules that prescribe the level of terms. On the question
of methods of compiling a thesaurus, there are at least three
principles that must be considered in every instance, whatever
techniques may later be applied to the actual task of
compilation. These are: 1/ The source from which terms are to be
selected; 2/ The purpose for which the thesaurus is intended;
3/ The approach to construction.

Two conventional methods have been used for the selection
of terms. One method is the use of existing dictionaries and
other lexical aids. This approach, more likely to be used for
a discipline-oriented system, tends to produce a set of terms
that is representative of accepted usage. Terms are likely to
be relatively stable and to reflect a consensus of definition.
Most disciplines have a number of technical dictionaries as
well as encyclopedias, handbooks or similar basic resources
from which terminology can be derived. A disadvantage of this
method is that the vocabulary so assembled is somewhat
"sterilised" and lacks the dynamic quality of current usage. Such a
thesaurus may be difficult to relate to the language of
literature or the patterns of recourse characteristic of
potential users.

A second method - extraction of terms from a selected
sample of documents - uses primary rather than secondary
sources. Thesauri built from the language of literature are more
current, but risk the need for more frequent amendment as the
scope of literature is broadened in actual use. Terms of less
stability will be more frequent and defining notes more
necessary to specify the accepted meanings. Mission-oriented systems
are the more likely applications for thesauri constructed from
literature.

The purpose of the thesaurus is another issue that is
essential to its construction. Those intended for application
to the literature of a discipline are started with a different
set of premises from thesauri being developed for an
organization with a defined mission and functions. While a core of
terms should be similar for a given discipline and a related
mission, considerable difference is likely in the peripheral
vocabulary. The two varieties are probably not interchangeable
in application.
A third consideration in methods of compilation is the
question of structure. In some instances, the structure is
built first and terms are selected and assigned at the time
of entry. Only in this way can a valid and workable thesaurus
be constructed. Reversing this process and building a
structure after the terms are collected creates an uneven and
imbalanced structure not truly representative of the topical
universe it is meant to define.
SOME PRAXIOSEMIOTIC PROBLEMS OF SCIENTIFIC INFORMATION*

Tadeusz Wójcik**

Praxiosemiotics /the theory of optimal sign/, praxiolinguistics
/the theory of optimal language/, the classification
theory, the theory of the designation of the message /the entry
characterising the message/, the theory of automated
information retrieval - all these branches of science play an
important role in the development of scientific information and its
theory.

In my report to the conference I present the principles of
my work on formal aspects of the general classification theory
and the theory of optimal language. I hope these considerations
will in some way stimulate the development of the central
problem of scientific information, i.e. the theory of retrieval
languages.

* This report is based on T. Wójcik: Prakseosemiotyka. Zarys
teorii optymalnego znaku. /Praxiological Semiotics. An outline
of a theory of the optimum sign/. Warszawa, 1969,
PWN, pp. 289.

** Warsaw University, Warsaw.
1. General description of the processes of information
   and cognition

I have subdivided cognition into sensory, symptomatic and
sign-message successive stages of cognition, terminating
with the exclusively human stage.

The main problem is that of the message, which is a tool for
converting an uninformed man, animal or machine into an informed
one. Both the message and any of its component parts, of
various degrees of complexity, are signs.

A message may be in the form of a sentence, or a larger
text, photograph, portrait, map, graph, a chair at
a table in a cafe to indicate the place is taken, a doorbell.

A message may act in diverse forms and especially as sound
and light waves.

The postulates of the optimum message require that it be
best adjusted to three sides, namely:

1. to the addressee of the message,
2. to its sender or creator,
3. to the reality communicated.

These postulates are compared with the corresponding
postulates of the optimum tool in general, which in this case is a
message. The postulates of the optimum message are the following:

1. concerning the addressee of the message:
   1. maximum access /in space and time/,
   2. perception of the form of message,
   3. maximum clarity of message;

2. concerning the sender or creator of the message:
   1. maximum ease of learning the technology of making and
      sending the message,
   2. maximum facility in making messages /both their content
      and form/ and sending them;

3. concerning the reality which a message reflects:
   1. unambiguity of the message,
   2. maximum correspondence between the message and the
      reality it reflects,
   3. optimum exactitude of the message.

To a large degree, though not completely, the message
depends upon its components being optimum in order to achieve
an optimum effect.

The code of message M is the classification of the set
of components of message M. The postulates for the optimum
code are: maximum reconstructability of the message from the
components of the code, perception of the elements of
the code, and maximum ease of learning its elements. Generally
speaking, the optimum code of message M is that classification
of a set of elements that is necessary and sufficient
to make the message M optimum. This is the postulate of
isomorphism between a set of elements of the reality described
and a set of elements of a code that satisfy additional
postulates with regard to the addressee and the sender of the
message. A one-to-one correspondence is a particular example
of isomorphism.
2. Classification codes

Classification codes are codes of classification messages
/i.e. classifications/*.

One of them will be described here since it seems necessary
for the understanding of the language code.

We make a table of classification assumptions to classify
the set S with respect to features A, B, ..., N. The columns of
the table will contain successive modifications of the features
A, B, ..., N /see Fig. 1/.
[Fig. 1. Table of Classification Assumptions /formal structure/.
Labels in the figure: classificandum; criteria of classification
/non-precise/; general forms of criteria of classification;
precised /detailed/ forms of criteria of classification
/modification/; criteria of classification in precise form.]



We may make different classification codes by transforming
the components of the table. It is sufficient to show in
addition the substance of the classification /notation/. This may be
the codes of a "tree", a table, a group of words, letters or
digits, etc.

I have enclosed, in brief, a method of making an ordered
classification message in a positional digital notation which is
a particular example of the naming notation. The symbols
consist of digits which are detailed numbers of lines including
precise forms of criteria /modifications/. The position of a
digit in the classification symbol indicates the number of the
column which includes the corresponding modifications. Hence
the positional notation. The resulting ordered classification
is called systematization.

* The author has described the classification code in a much
more complete form in his book Zarys teorii klasyfikacji.
Zagadnienia formalne /Outline of the Theory of Classification.
Formal Problems/, Państwowe Wydawnictwo Naukowe, 1965, 184 pp.

The components of a message /parts of a classification/ are
placed in a permanent position in relation to one another so
that the classification message will be ordered.

Directives for making an ordered classification of the set
P with respect to A /in detail: $A_1, \dots, A_x$/, B /in detail:
$B_1, \dots, B_y$/, ..., N /in detail: $N_1, \dots, N_z$/.

0. Make a table of classification assumptions of the set P
   with respect to the abovementioned detailed criteria
   /Fig. 1/.

1. Make an ordered Cartesian product of the sets

   $\{P\} \times \{A_0, A_1, \dots, A_x\} \times \{B_0, B_1, \dots, B_y\} \times \dots \times \{N_0, N_1, \dots, N_z\}$

   which are mentioned in the table of classification
   assumptions.

2. Eliminate the elements of the Cartesian product where the
   digit 0 precedes every other non-0 digit /e.g. 011, 201/.

3. Insert the resulting symbols into columns containing the
   same number of non-0 symbols, in increasing order, and keep
   the order of the Cartesian product.
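
Directives 1 to 3 can be followed mechanically. The sketch below is an illustration under stated assumptions rather than the author's procedure: directive 2 is read as "eliminate every symbol in which a 0 stands somewhere to the left of a non-0 digit", a reading chosen only because it agrees with the two examples 011 and 201; each criterion is assumed to have at most nine modifications so that one digit per position suffices; all function names are hypothetical.

```python
from itertools import product


def cartesian_symbols(modification_counts):
    """Directive 1: ordered Cartesian product of the digit sets.

    modification_counts[k] is the number of modifications of
    criterion k; digit 0 means the criterion is disregarded.
    """
    ranges = [range(n + 1) for n in modification_counts]
    return ["".join(str(d) for d in combo) for combo in product(*ranges)]


def zero_precedes_nonzero(symbol):
    """Assumed reading of directive 2: a 0 stands left of a non-0 digit."""
    seen_zero = False
    for digit in symbol:
        if digit == "0":
            seen_zero = True
        elif seen_zero:
            return True
    return False


def systematize(modification_counts):
    """Directives 2 and 3: eliminate, then group by count of non-0 digits."""
    columns = {}
    for symbol in cartesian_symbols(modification_counts):
        if zero_precedes_nonzero(symbol):
            continue
        non_zero = len(symbol) - symbol.count("0")
        # Appending in iteration order keeps the order of the product.
        columns.setdefault(non_zero, []).append(symbol)
    return columns


if __name__ == "__main__":
    for non_zero, symbols in sorted(systematize([2, 2, 2]).items()):
        print(non_zero, " ".join("P " + s for s in symbols))
```

Under a different reading of directive 2 only `zero_precedes_nonzero` would need to change; the product and the grouping stay the same.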

By following the above directives in order, we get the
following results in the case limited to three criteria, each with
two modifications:

Directive 0

Fig. 2. Table of Classification Assumptions /digital notation/

0 is the symbol for a classification criterion which is
disregarded in a particular case.





Directives 1 and 2

    P 000    P 100    P 200
    P 001    P 101    P 201
    P 002    P 102    P 202
    P 010    P 110    P 210
    P 011    P 111    P 211
    P 012    P 112    P 212
    P 020    P 120    P 220
    P 021    P 121    P 221
    P 022    P 122    P 222

Fig. 3. Cartesian Product. In the original figure the eliminated
elements are framed.
A set P of cloths from a textile factory will serve as
an example.

The set is classified /systematised/ according to the
following criteria: A: substance of cloth /natural, synthetic/, B:
thickness of cloth /thin ≤ 0.1 mm, thick > 0.1 mm/, C: color of
cloth /natural, dyed/.

The table of ordered classification assumptions is the
following:



Directive 3

Fig. 4. Systematization Message
in a Positional-digital Notation

Fig. 5. Table of Classification Assumptions
/example/

The classification components of the different cloths are
as follows:

P 000 = cloth /systematised set/,

P 100 = cloth of natural substance,

P 210 = thin cloth of synthetic substance,

P 121 = thick and undyed cloth of natural substance,

P 002 = dyed cloth.
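
Reading a symbol back into its components is a lookup by position. The sketch below assumes that digit 1 stands for the first modification named for each criterion and digit 2 for the second, an assignment inferred from the listed examples (e.g. P 210 and P 121); the names `CRITERIA` and `decode` are hypothetical.

```python
# Criteria in positional order; digit 0 means "criterion disregarded".
CRITERIA = [
    ("substance", {1: "natural", 2: "synthetic"}),
    ("thickness", {1: "thin", 2: "thick"}),
    ("color", {1: "natural", 2: "dyed"}),
]


def decode(symbol):
    """Map a symbol such as 'P 210' to its {criterion: modification}."""
    _set_name, digits = symbol.split()
    if len(digits) != len(CRITERIA):
        raise ValueError("expected one digit per criterion")
    features = {}
    for digit, (name, modifications) in zip(digits, CRITERIA):
        if digit != "0":
            features[name] = modifications[int(digit)]
    return features


if __name__ == "__main__":
    for symbol in ["P 000", "P 100", "P 210", "P 121", "P 002"]:
        print(symbol, decode(symbol))
```

An empty result, as for P 000, stands for the whole systematised set, since no criterion is applied.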

The table classification is a particularly useful form of classification message. It has been widely applied to the classification-terminological standards in this country since the numerical notation is brief.
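
The positional-digital notation lends itself to mechanical decoding. The following is a minimal Python sketch, assuming the three criteria and two modifications of the cloth example; the names CRITERIA, decode and all_codes are illustrative and do not come from the paper.

```python
from itertools import product

# Criteria of the cloth example, in positional order A, B, C.
# Digit 0 means "criterion disregarded"; 1 and 2 select a modification.
CRITERIA = [
    ("substance", {1: "natural", 2: "synthetic"}),
    ("thickness", {1: "thin", 2: "thick"}),
    ("color", {1: "natural", 2: "dyed"}),
]


def decode(code):
    """Translate a digit string such as '210' into the criteria it fixes."""
    if len(code) != len(CRITERIA):
        raise ValueError("expected one digit per criterion")
    meaning = {}
    for digit, (name, modifications) in zip(code, CRITERIA):
        value = int(digit)
        if value != 0:  # 0 = criterion disregarded
            meaning[name] = modifications[value]
    return meaning


def all_codes():
    """Full Cartesian product of the digits 0, 1, 2 over all positions."""
    digits = range(3)
    return ["".join(map(str, combo)) for combo in product(digits, repeat=len(CRITERIA))]


print(decode("210"))     # {'substance': 'synthetic', 'thickness': 'thin'}
print(len(all_codes()))  # 27
```

The sketch only enumerates and decodes the symbols; it does not apply the elimination and column-ordering directives.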

Language codes 

Praxiological linguistics is a part of praxiological semiotics. The language is a classification of a set of sentential message elements. The optimum language contains exclusively the necessary and sufficient components for the optimum sentential message to be in as aesthetic a form as possible.

Two main tasks of praxiological linguistics are:

1. The task of cognition: to facilitate research on the basic language function on the basis of the simplest construction /praxiological model of language/. Such a construction would also facilitate a description of natural languages, which may be hypothetically regarded as a deviation from the model.

2. The practical task of direct utility in:

   1. research on problems of machine translation,

   2. research on the creation of the optimum language.



The optimum language, while being primarily the language for scientific information, would enable direct communication of anybody with anybody else. Moreover, it would enable the formation of a better picture of the world, thus helping men to function much more efficiently intellectually. The problems of praxiological linguistics lie within the field of interest not only of logicians and linguists but also praxiologists, cyberneticists, psychologists and sociologists. To get to the root of the problem I have suggested an extensively developed group of grammatical terms. The formal structure of the optimum language is given priority in this elaboration, which consists of syntactic and phono-graphical structures. Next come the general principles of building the semantic structure of notion groups, the content-structure. The last problem to solve is to establish a one-to-one correspondence between the formal structure and the content.

Particular consideration should be devoted to the syntactic structure of such sentential message components with increasing degrees of complexity, as an element of a word /roots, affixes, endings/, a word, a phrase /a group of words like the subject and its modifiers/, a simple sentence, a compound sentence.

The classification algorithm of the optimum language is a set of instructions on how to make sentences or phrases at different degrees of complexity out of their lexical elements. The algorithm also concerns the making of the phonetic structure of the language. Thus the algorithm is a formal instruction on how to form language expressions. The classification method also touches the problem of morphology - the problem of making correct notion groups. Although the problems of the formal side of the language can be easily solved, the content requires thorough research. Such research should concern the entire set of notion groups characteristic of different branches of science and reflect these notions in the forms of the optimum language at the formal, that is, syntactic and phonetic level. Thus the algorithm can only concern the formal aspect of the language.

The optimum language model consists of the following parts:

1. the classification algorithm
   1. of the syntactic structure,
   2. of the phonetic structure,

2. morphemes,

3. examples of some semantic solutions,

4. the main principles of a one-to-one correspondence between the form and content of the language.

The classification algorithm of the optimum language consists of a table of classification assumptions of the set of sentential message components, and an instruction how to make the ordered Cartesian product of the elements of the table. Fig. 6 shows an outline of this algorithm.

Fig. 6. Table of Classification Assumptions /message/

< A × B × ... × N >

We can get an algorithm of the syntactic or phonetic structures by substituting the variables accordingly.

The syntactic algorithm can be obtained in the form of one table of assumptions. However, it would be very large and hardly intelligible. Therefore, for the sake of convenience, we make four tables for their respective components of different degrees of complexity: one for the compound sentence; one for the simple sentence; and one for the component of a phrase or word, respectively.
Fig. 7. Table of Classification Assumptions /compound sentence/

Each of the four tables helps to obtain two results, namely:

1. A group of general structures of a definite kind - symbolized by 0-1.

2. A group of language expressions of a definite kind - substituting an appropriate expression for the variables.

The group contains 4 032 possible sentence structures.

An algorithm of the phonetic make-up of a syllable is based on a table of classification assumptions like the one below.
Fig. 8. Table of Classification Assumptions /simple sentence/

Fig. 9. Table of Classification Assumptions /phrase component, extended/. Column labels: Indicator, Attributor, Quantifier, Word, Conjunction.
A syllable consisting of two initial and two final consonants is regarded here as the most complex. There are the following types of syllables /here 0 stands for a vowel and 1 for a consonant/:

Table symbol     Syllable type
00 - 1 - 00      0
00 - 1 - 10      01
00 - 1 - 11      011
01 - 1 - 00      10
01 - 1 - 10      101
01 - 1 - 11      1011
11 - 1 - 00      110
11 - 1 - 10      1101
11 - 1 - 11      11011
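
The nine syllable types can be enumerated mechanically. A minimal Python sketch, assuming at most two initial and two final consonants around one vowel as stated above; the function name syllable_types is illustrative only.

```python
def syllable_types(max_initial=2, max_final=2):
    """List syllable patterns, writing 0 for the vowel and 1 for a consonant."""
    types = []
    for initial in range(max_initial + 1):      # number of initial consonants
        for final in range(max_final + 1):      # number of final consonants
            types.append("1" * initial + "0" + "1" * final)
    return types


print(syllable_types())
# ['0', '01', '011', '10', '101', '1011', '110', '1101', '11011']
```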



Fig. 10. Table of Classification Assumptions /word/. Word components: semantic precisioner, syntactic precisioner.

Fig. 11. Group of Structures of Simple Sentences

Fig. 12. Syllable



Such is the thorough description of the classification algorithm of the formal structure of the optimum language: the syntactic and the phonetic structure.

The grammatical morphemes are the following sets:

1. characterizer, e.g.: it's so that, it's not so that, let it be so, is it so?

2. sentence conjunction, e.g.: or, and, if... then, if only,

3. quantifier, e.g.: 0, 1..., certain, every, few, many,

4. word conjunction, e.g.: or, and, nor, together with,

5. syntactic precisioner, e.g.: object, feature, relation, manner, place, time,

6. semantic precisioner, e.g.: -ness, -ry, -ism, -logy, etc.



I have given some specimens of lexical morphemes in this account.

Namely: the table of syntactic precisioners and some semantic precisioners /relator precisioners, basic and auxiliary/, precisioners of gradation, relator precisioners of attributor /some specimens/.

Besides, I have given the table of personal pronouns and a few quantifiers /numerals/.



Fig. 13. Syntactic Precisioner

* Some precisioners are null /for a few lexical groups/.



Fig. 14. Semantic Precisioner of Relator /basic/

Examples /derivatives of "term"/, with the following English glosses: temperature, cold, cool, tepid, hot, heat, warmth, to cool, to air-condition, etc.
Fig. 15. Semantic Precisioner of Feature Gradation /gradator/. Columns: absolute gradator, comparative gradator; grades: very small, quite small, medium, quite large, very large.



Some examples of the use of the relational precisioner of an attributor:

1. how existing, functioning? /null precisioner/
   domo ligna - wooden house, scriban bone - to write correctly

2. related to /whom/? re-
   re-Petra frato - Peter's brother

3. whose property? di-
   di-Petra domo - Peter's house

4. making, doing what? la-
   la-libra scribano - /a man/ writing a book

5. made, done by whom? par-
   par-Petra libro - a book written by Peter

6. coming from where? de-
   de-Africa frukto - an African fruit /from Africa/

7. destined /suitable for/? pur-
   pur-homela vasto - women's wear /for women/
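
The relational precisioners listed above behave like a small lookup table of prefixes. A minimal Python sketch, assuming that the precisioner is always separated from the stem by a hyphen as in the examples; the names PRECISIONERS and split_attributor, and the short English labels, are illustrative only.

```python
# Prefix -> relation it expresses, following the seven examples above.
PRECISIONERS = {
    "re": "related to",
    "di": "property of",
    "la": "making, doing",
    "par": "made, done by",
    "de": "coming from",
    "pur": "destined for",
}

NULL_PRECISIONER = "how existing, functioning"


def split_attributor(word):
    """Return (relation, stem) for an attributor such as 'di-Petra'."""
    prefix, hyphen, stem = word.partition("-")
    if hyphen and prefix in PRECISIONERS:
        return PRECISIONERS[prefix], stem
    # No recognised prefix: the null precisioner applies to the whole word.
    return NULL_PRECISIONER, word


print(split_attributor("di-Petra"))  # ('property of', 'Petra')
print(split_attributor("ligna"))     # ('how existing, functioning', 'ligna')
```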

So much for the syntactic structure of language. One can especially extend other semantic precisioners. The phonetic structure of language requires experimental research by phoneticians and psychologists.

This means choosing the optimum sounds and the syllable make-up and then the prosodic features of the language.

The application of the optimal language to the needs of scientific information opens broad perspectives for improving the work of information systems.




Fig. 16. [...] and syntactic derivatives

Fig. 17. Personal Pronoun

Fig. 18. Family of Words /example/
ON A DEFINITION OF A THESAURUS SYSTEM AND STRUCTURE

Irena Bellert and Olgierd A. Wojtasiewicz 1/

In the papers submitted for the International Conference on General Principles of Thesauri Building, held in Warsaw in March 1970, and in the discussions which took place during the Conference, the terms "system of thesaurus" and "structure of thesaurus" have been used repeatedly in several contexts without being precisely defined - moreover, in different senses of these terms. We believe that a formal definition of the system of thesaurus that would account for the relations involved would make it possible not only to use these terms consistently, but also would be helpful in constructing thesauri which pertain to selected areas of science or technology in accordance with the same well defined scheme, independently of the language in which the given thesaurus is described. A comparison of thesauri in two different languages could thus be arrived at more easily and multilanguage thesauri could more conveniently be constructed.

The present proposal can be treated as a comment and an outline of a formal description of the ideas contained in the UNESCO Guidelines for the Establishment and Development of Monolingual Scientific and Technical Thesauri for Information Retrieval, a paper submitted for the International Warsaw Conference on General Principles of Thesauri Building. It goes without saying that for a more specific formulation of the relations discussed below, it would be necessary to elaborate on this topic in connection with a given specific area of science or technology.

1/ Department of Formal Linguistics, Warsaw University
Any system can be defined as an ordered sequence

$\langle A, R_1, \ldots, R_n \rangle$

where $A$ is a set and $R_i$ /for $1 \le i \le n$/ is a relation defined on $A$.

A thesaurus can be defined as a specific system:

/1/ $\langle T, R_1, \ldots, R_n \rangle$

where $T$ is a finite set of terms /nouns and noun phrases of a natural language/, and the relations $R_1, \ldots, R_n$ are one- or two-place relations defined on $T$. Certain relations will be characteristic of a specific domain of science or technology, others will be characteristic of any domain for which a thesaurus may be constructed. Among the latter relations we may mention a linear order relation corresponding to the alphabetical order; partial order relations corresponding to concepts such as "term broader than", "term generic to" /and their respective converses: "term narrower than", "term specific to"/, "part of whole", etc.; equivalence relations, such as the synonymity relation between terms; other relations corresponding to concepts such as "used for", "belongs to the domain of", "thing-property", etc.

It is worth noting that every relation determines a graph, which would correspond precisely to what is referred to as "graphical display" in the description of thesauri. The term "structure" as used in connection with thesauri can thus be identified with the structures of the graphs determined by the corresponding relations. In this connection the terms "system" and "structure" as used in the description of thesauri would be interrelated and precisely defined. For we have a set of structures given by each thesaurus described as an ordered system consisting of a finite set of terms and a sequence of relations.
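
The view of a thesaurus as a set of terms with a sequence of relations, each determining a graph, can be sketched in Python. This is only an illustration: the term "engine" and its placement as a broader term are assumptions added for the example, and the names terms, relations and graph_of are hypothetical.

```python
from collections import defaultdict

# A toy thesaurus <T, R_1, ..., R_n>: a finite set of terms and named
# two-place relations, each stored as a set of ordered pairs over T.
terms = {"engine", "steam engine", "combustion engine", "steam"}
relations = {
    # "broader than" (the hierarchy chosen here is an assumption)
    "B": {("engine", "steam engine"), ("engine", "combustion engine")},
    # "related to"
    "R": {("steam", "steam engine"), ("steam engine", "steam")},
}


def graph_of(relation):
    """Adjacency sets of the directed graph determined by a relation."""
    graph = defaultdict(set)
    for a, b in relation:
        graph[a].add(b)
    return dict(graph)


for name, relation in relations.items():
    print(name, graph_of(relation))
```

Each entry of relations yields one graph, which matches the reading of "structure" as the structure of the graph determined by a relation.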

Let us now present some examples of relations which can be defined for any thesaurus, and give some additional comments.



The synonymity relation $R_{syn}$ is an equivalence relation defined on $T$, for it is:

/2/ /a/ reflexive: $R_{syn}(x,x)$,
    /b/ symmetric: $R_{syn}(x,y) \Rightarrow R_{syn}(y,x)$,
    /c/ transitive: $R_{syn}(x,y) \land R_{syn}(y,z) \Rightarrow R_{syn}(x,z)$.

$R_{syn}$ - being an equivalence relation - forms a partition on the set $T$, that is, it divides the set $T$ into non-empty, disjoint subsets /equivalence classes/. We may then choose a representative element $x_i$ from each subset that contains more than one element, and the term chosen as a representative will then be used in the description of other relations; all the other elements will appear only in the relation of alphabetical order, and will not be used anywhere else.
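
The partition into equivalence classes and the choice of representatives can be sketched as follows. This is a minimal Python sketch under stated assumptions: the sample terms are invented, and picking the alphabetically first member as the representative is an arbitrary rule used only for illustration, since the paper does not prescribe one.

```python
def equivalence_classes(terms, synonym_pairs):
    """Partition terms into classes of the equivalence generated by the pairs."""
    parent = {t: t for t in terms}

    def find(t):
        # Follow parent links to the class root, shortening the path on the way.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for a, b in synonym_pairs:
        parent[find(a)] = find(b)

    classes = {}
    for t in terms:
        classes.setdefault(find(t), set()).add(t)
    return list(classes.values())


def representatives(classes):
    """One term per class; here simply the alphabetically first (an assumption)."""
    return {min(c) for c in classes}


terms = {"car", "automobile", "motor car", "bicycle"}
pairs = [("car", "automobile"), ("motor car", "car")]
classes = equivalence_classes(terms, pairs)
print(sorted(sorted(c) for c in classes))
# [['automobile', 'car', 'motor car'], ['bicycle']]
print(sorted(representatives(classes)))
# ['automobile', 'bicycle']
```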

The set $T$ will comprise all the possible terms, descriptors and non-descriptors. The set $T$ can thus be conceived of as

/3/ $T = D \cup N$

to be read thus: the set $T$ of terms is the union /logical sum/ of a set $D$ of descriptors and a set $N$ of non-descriptors;

$D$ and $N$ are finite, non-empty disjoint sets:

$D \cap N = \emptyset$

The following relations can also be defined:

/4/ $P(a,b)$

/5/ $B(a,b)$

/6/ $R(a,b)$

$P(a,b)$ is the preference relation, and may be interpreted thus: a is to be preferred to b as a descriptor. This relation is obviously irreflexive, asymmetric, and intransitive. If a term is once rejected as unsuitable to be a descriptor, then it cannot on any other occasion be given preference to any other term, because if it were given preference, then it would have to be included in the list of descriptors in spite of being previously disqualified as a descriptor.
$B(a,b)$ is the relation holding between a broader and a narrower term: a is a broader term than b. This relation is irreflexive, asymmetric, and transitive, and as such it is an ordering relation. The relation holding between a narrower and a broader term, $N(a,b)$, also listed in the Guidelines, can trivially be defined as the converse of the former:

/7/ $N(a,b) \Leftrightarrow \breve{B}(a,b) \Leftrightarrow B(b,a)$.

$R(a,b)$ is the affinity relation, and is reflexive and symmetric, but not transitive; although transitivity may be observed in certain cases, it may not be assumed to be a rule. This can be explained by the following example: steam and steam engine are related terms, and so are steam engine and combustion engine, but steam and combustion engine presumably are not related.
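
The steam example can be checked mechanically. A minimal Python sketch, assuming that "related" is read as a two-way link and that every term is related to itself; the helper names are illustrative only.

```python
def is_reflexive(terms, rel):
    return all((t, t) in rel for t in terms)


def is_symmetric(rel):
    return all((b, a) in rel for a, b in rel)


def is_transitive(rel):
    # For every chain a -> b -> d the direct pair (a, d) must also be present.
    return all((a, d) in rel for a, b in rel for c, d in rel if b == c)


terms = {"steam", "steam engine", "combustion engine"}
related = {("steam", "steam engine"), ("steam engine", "combustion engine")}

# Close the stated pairs under symmetry and add the reflexive pairs.
R = related | {(b, a) for a, b in related} | {(t, t) for t in terms}

print(is_reflexive(terms, R))  # True
print(is_symmetric(R))         # True
print(is_transitive(R))        # False: steam / combustion engine is missing
```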

The following implications are assumed:

/8/ $P(a,b) \Rightarrow (a \in D) \land (b \in N)$,
    $P(a,b) \Rightarrow B(a,b)$,
    $B(a,b) \Rightarrow R(a,b)$.

The following relation can be defined by the well-known procedure:

/9/ $B'(a,b) \Leftrightarrow B(a,b) \land \neg \bigvee_{c} \bigl( B(a,c) \land B(c,b) \bigr)$.

This is to be read thus: a is the next broader term to b /a is broader than b and there is no term c such that it stands between a and b in the hierarchy of broadness/.

In the definitions below it is assumed that a and b are in $D$, and hence this fact is not marked in the formulae.

/10/ $A(a) = \{ b : R(a,b) \}$.

This is the affinity class of a.

/11/ $F(a) = \{a\} \cup \{ b : B(a,b) \}$.

The field of a: a and all those terms to which a is broader. In the definitions the one-element set $\{a\}$ must be included since $B$ is not reflexive; $B(a,a)$ accordingly does not hold, and thus a would not be in its own field. This does not apply to $A(a)$ as $R$ is reflexive.
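
The next-broader relation and the field of a term can be computed directly from a finite relation B. A minimal Python sketch, assuming B is given as a transitive set of (broader, narrower) pairs; the sample terms and the names next_broader and field are invented for illustration.

```python
def next_broader(B):
    """Pairs (a, b) of B with no term c such that B(a, c) and B(c, b)."""
    terms = {t for pair in B for t in pair}
    return {
        (a, b)
        for a, b in B
        if not any((a, c) in B and (c, b) in B for c in terms)
    }


def field(a, B):
    """The term a together with every term to which a is broader."""
    return {a} | {b for x, b in B if x == a}


# A transitive "broader than" relation on three sample terms.
B = {("vehicle", "car"), ("car", "sports car"), ("vehicle", "sports car")}

print(sorted(next_broader(B)))
# [('car', 'sports car'), ('vehicle', 'car')]
print(sorted(field("vehicle", B)))
# ['car', 'sports car', 'vehicle']
```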

/12/ $H(a) = \{a\} \cup \{ b : B(a,b) \lor B(b,a) \}$.

This is the hierarchy class of a, which, as compared with $F(a)$, includes not only those terms which are narrower than a, but also those which are broader than it.
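
The difference between the field and the hierarchy class of a term can be seen on a small example. A minimal Python sketch with invented sample terms; treating the term itself as a member of its hierarchy class is an assumption of this sketch, made so that the field is always contained in the hierarchy class.

```python
def field(a, B):
    """The term a and all terms narrower than a."""
    return {a} | {b for x, b in B if x == a}


def hierarchy_class(a, B):
    """The term a with all terms narrower than a and all terms broader than a."""
    narrower = {b for x, b in B if x == a}
    broader = {x for x, b in B if b == a}
    return {a} | narrower | broader


# A transitive "broader than" relation given as (broader, narrower) pairs.
B = {("vehicle", "car"), ("car", "sports car"), ("vehicle", "sports car")}

print(sorted(field("car", B)))            # ['car', 'sports car']
print(sorted(hierarchy_class("car", B)))  # ['car', 'sports car', 'vehicle']
```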

Usually in a hierarchy class limits must be set beyond which certain terms are not included because of being either too general or too specific. This can be defined as follows:

/13/ a = minimally narrow term/s/ in $H(a)$ $\Leftrightarrow \bigvee_{b \in H(a)} \bigl( B'(a,b) \land (a \in D) \land \neg (b \in D) \bigr)$,

/14/ a = maximally broad term/s/ in $H(a)$ $\Leftrightarrow \bigvee_{b \in H(a)} \bigl( B'(b,a) \land (a \in D) \land \neg (b \in D) \bigr)$.

It is to be noted that the preference relation does not ensure the inclusion in $D$ of all those terms which it may be desirable to have as descriptors. This is so because this relation provides for the inclusion in $D$ of those terms which are preferred to some other terms. It may happen, however, that a term is to be included even though its inclusion does not imply the exclusion from $D$ of any term. If, for instance, one makes a thesaurus of chemical terms, he will include as descriptors the names of all chemical elements. Hence, if oxygen is included, this fact does not bar any other term from inclusion in $D$ under the preference relation. This is why it is recommended that an inclusion operation be introduced, to be defined thus:

/15/ $I(a) \Leftrightarrow (a \in D)$,

/16/ $P(a,b) \Rightarrow I(a)$,

but not conversely.

The Guidelines mention the whole-part relation as a case of the relation holding between a broader and a narrower term. It is to be noted, however, that $B$ holds between terms, and the whole-part relation holds between designata of terms /that relation can also hold between terms in the sense that "combustion" is part of "combustion engine", but this obviously is not what the authors of the Guidelines mean/. If this relation is to be retained, the following improvement is suggested. A relation $M$ holding between terms is introduced and is tentatively defined thus:

/17/ $M(a,b) \Leftrightarrow$ there exist x and y such that x and y are objects, and x is part of y, and a and b are names of x and y, respectively.

The concepts defined or introduced above can be used in a further extension of the formal description of the system under consideration. E.g.,

/18/ $F(a) \subset H(a)$,

/19/ The fields of a and b overlap $\Leftrightarrow F(a) \cap F(b) \neq \emptyset$.

As the main goal of the construction of thesauri is their use in retrieval systems, it would be desirable to describe a thesaurus system as a component of a retrieval language. An analogy suggests itself with a description of a natural language system. The search for a recursive system of rules for generating all and only sentences of a natural language has been in the centre of attempts made by a number of recent linguistic projects. It has appeared that the problems of grammar and lexicon are deeply interconnected, and no adequate description of lexical entries can be achieved without syntactical information. It seems that, analogically, there is an interconnection between a thesaurus system and a retrieval language. The retrieval language to be used for a thesaurus should be constructed in close relationship with the structures determined by the relations defined on the set of terms $T$, as discussed above. It is necessary to specify what sort of questions can be asked in a retrieval language, and this cannot be done otherwise than by relating them to a thesaurus system which has to be precisely defined for that purpose. It may be worth mentioning here that those who work on constructing thesauri and retrieval systems are in a better position than linguists, for they can establish normative, rather than descriptive rules /they can get rid of synonyms, homonyms or vague terms/.

In conclusion, we wish to emphasise that the problem of a formal description of thesauri has been only touched upon inconclusively, but the matter seems worth elaborating on, and our goal was to draw the reader's attention to such a possibility.
MINUTES OF THE CONFERENCE

Monday, 23rd March 1970. Afternoon Session. Chairman: J. Toman

POINT 1 OF THE QUESTIONNAIRE: DEFINITION OF A THESAURUS

What is meant by a thesaurus?

Mr. T. M. Aitchison: "A thesaurus is an alphabetical listing of concepts /i.e. descriptors/ which provides structural and relational information about the concepts".

Mr. Jansen: For purposes of information storage and retrieval a thesaurus is an orderly compilation of concepts

- represented by as many synonymous terms as possible in one or more languages,

- in which homonymous terms are specially marked,

- in which a descriptor univocally represents a concept, and

- in which semantic relationships between concepts are registered.

Underlying definitions:

1/ Definition of Concept:

Mental idea of material or immaterial object based on common characteristics which are usually formed by abstraction and found identical.

2/ Definition of Term:

Name given to a concept and consisting of one or more words.

3/ Definition of Descriptor:

Univocal representative of a concept in a documentation system. The descriptor can be a fixed term /"preferred term"/ or any other stipulated designation.
Mr. Leski: "A thesaurus may be defined as follows: An open system covering a determined full thematic range containing an orderly multitude of terms, some of which are admitted as descriptors, showing the relations between these terms and their mutual dependence".

Mr. Rolling: "A thesaurus can be defined as a structured vocabulary for use in information storage and retrieval systems".

Mr. Wysocki cited the definition from the UNESCO Guidelines: "By the word thesaurus is meant a comprehensive and structured vocabulary of interrelated terms some of which are used in the indexing and retrieval of a collection of documentary material pertaining to a specific domain or domains of science and technology".

Mr. Toman pointed out that a definition of an object must distinguish it from other similar objects, but unfortunately in most of the cited definitions the words "classification", "uniterms" or "subject headings" can be inserted instead of the expression "thesaurus", and still the definition would keep its sense.

The session decided to postpone this question to Wednesday or Thursday when other problems of thesauri would be discussed.

Mr. Toman proposed to build the definition /or better, explanation/ by expressing first the purpose of the thesaurus and then by describing its possible characteristics: "A thesaurus is a system of controlled terms used for characterising the content of documents in a storage and/or retrieval system. Conventional thesauri use alphabetical listing for displaying synonyms, hierarchy, and other relations, whereas thesauri with graphic maps stress the importance of the systematic display of terms. Thesauri are further characterized by their dynamics, by a weak hierarchy, by preferring post-coordination to pre-coordination. In contrast with classification systems they show related terms other than synonyms, broader and narrower terms, but do not use notation or categories".

Mr. Varga considered some requirements regarding the terms of a thesaurus, pointing out that they must

- be based on a natural language,

- be unambiguous,

- constitute a decentralised collection.

As to relations within a thesaurus, they must

- express only objective, not artificial connections between terms,

- show many aspects of a connection,

- be clearly distinguished,

- be reciprocal.

1.2 Structural elements

The participants in the session indicated several elements characteristic of thesauri:

- dynamics of the thesauri /a better word than "open-endedness"/;

- display of hierarchical, synonym and associative relations;

- display of relations: term to term, concept to concept, concept to term.




1.3 Factors influencing the organisation of a thesaurus

Messrs. Lloyd, Gravesteijn, Leski, Molnar, Mojžíšek, and Maixner stated these factors:

- branch of science or technology

- thematical range and overlapping

- assumed degree of fineness

- users' requirements

- grammatical rules

- structure of the file

- methodology of the system

- clearness or vagueness of the terminology of the branch

- language /Hungarian or German/

- nature of the medium of the record /whether a bound index, a card file or a computer/.

Mr. Rolling completed this list by citing his formula for the size of a thesaurus, with the following symbols:

T - size of thesaurus
a - indexing depth
k - redundancy factor /number of synonyms/
n - retrieval strategy /number of terms according to the strategy/
V - size of document collection
R - response /number of references expected by the type of user/

With regard to the structure of thesauri Mr. Varga stated that thesauri must be:

- compatible

- easily broadened

- easily corrected
Tuesday, 24th March. Morning Session. Chairman: Mr. L. Rolling

ITEM 1 ON AGENDA: THE ROLE OF THESAURI

The main use of a thesaurus is for terminology control in indexing and retrieval.

Agreement was reached on the following:

The role of a thesaurus is to ensure that indexing and retrieval of documents can be effected with maximum efficiency /precision and completeness/.

In addition, the existence of a thesaurus covering a subject field tends to stabilise the terminology of this field.

ITEM 2b ON AGENDA: THE DEFINITION OF A DESCRIPTOR

The Chairman analysed the definitions proposed by Messrs. Leski, Jansen, Spang-Hanssen and Molnar, and the definition contained in the UNESCO Draft Guidelines. They agreed that:

- a descriptor is a formalised, standardised, or controlled term;

- a descriptor is to represent one /or a combination of/ concepts in an unambiguous, or univocal way;

- and that descriptors can consist of symbols.

Mr. Jansen thought that notations must also be considered as descriptors; Mr. Rolling was of the opinion that notations such as classification codes or current numbers assigned to descriptors for further processing cannot be considered as descriptors.

Messrs. Rosenbaum and Poletylo wished to limit the use of descriptors to systems based on concept coordination; subject headings should not be considered as descriptors.

The assembly appeared to agree on the following:

A descriptor is an authorised and formalised term or symbol in a thesaurus used unambiguously to represent the concepts of documents and queries in information systems based on concept coordination.




ITEM 2c ON AGENDA: THE DEFINITION AND NAME OF FORBIDDEN TERMS

The assembly preferred the name NON-DESCRIPTOR to the proposed name ASCRIPTOR.

By definition, any thesaurus term or symbol not considered as a descriptor is a non-descriptor.

ITEM 2d ON AGENDA: REQUIREMENTS TO BE FULFILLED BY A DESCRIPTOR

It was agreed that a descriptor must unambiguously characterize the concept/s/ that it represents, and that its spelling should be subject to a number of rules.

Agreement was also reached on the requirement that a descriptor must have a "reasonable" frequency of assignment and possess a certain combinatory power. A "reasonable" frequency could be defined, according to Mr. Molnar, as a function of average frequency of assignment.
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Tuesday (24th March 1970. Afternoon Session » Cbalxman: J« Lloyd 



The meeting recocnrened at 3*oo p»m« to continue Its consi-> 
deration of point 2 of the revlaed agenda e "What should be de- 
manded In order to accent a term as a descriptor" . The question 
of grammatical form was placed before the group and Ur. V y* 
s 0 c h 1 read from page 5 of the UNESCO Guidelines onward. 

Ur. Spang-Banssen observed thp*^ the proposed 
UNESCO rules concerning "Dumber*^ "noun f etc. were not 
applicable to many Inflected European languages* 

Mr. V y s o c h 1 pointed out that the Guidelines were 
written In Sngllshi for Engllsh« as stated In the document. 

General opinion wss that since grammatical usage varies 
frcmi language to language « It was better for each thesaurus to 
establish Its rulesi state what they were In a preface » and stay 
absolutely consistent with them throughout* 

The meeting then moved to point 2e of the revised agenda i 
" The termlnolofflr of cross^refsrances" . 

Mr. Varga questioned the Guidelines approach to cross-references
between various thesauri.

Mr. Wysocki explained that the purpose of the Guidelines was not
directed to the construction of any one thesaurus.

Mr. Varga then spoke on relations within a thesaurus and the
significance of hierarchical and non-hierarchical concepts both
within and beyond a thesaurus.

Mr. Spang-Hanssen asked if we were discussing "roles" and, if so,
did they belong in a thesaurus?

The Chairman directed the discussion back to "preferential,
hierarchical, and affinitive" as proposed in the Guidelines.

Mr. Rolling suggested that concepts might be linked in an A → B
relationship by "generic posting" as discussed in his paper. He
preferred this procedure to posing a Boolean query in retrieval.

Mr. Toman suggested that, as terms can be members of a plurality
of hierarchical chains, all relationships should be shown by tables
or a thematic display.
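
As an illustration of the "generic posting" Mr. Rolling refers to,
combined with Mr. Toman's remark that a term may belong to several
hierarchical chains, the following Python sketch posts a document
under each assigned descriptor and under all of its broader terms.
The sample hierarchy and the names `BROADER` and `generic_posting`
are assumptions made for this example; they are not taken from the
conference papers.

```python
# Hypothetical hierarchy: each descriptor maps to its broader terms.
# A descriptor may have several broader terms, i.e. it may belong to
# several hierarchical chains.
BROADER = {
    "STAINLESS STEEL": ["STEEL", "CORROSION-RESISTANT MATERIALS"],
    "STEEL": ["IRON ALLOYS"],
    "IRON ALLOYS": ["ALLOYS"],
    "CORROSION-RESISTANT MATERIALS": ["MATERIALS"],
    "ALLOYS": ["MATERIALS"],
}


def generic_posting(assigned):
    """Return the assigned descriptors plus all of their broader terms."""
    posted = set()
    pending = list(assigned)
    while pending:
        term = pending.pop()
        if term in posted:
            continue  # already reached through another hierarchical chain
        posted.add(term)
        pending.extend(BROADER.get(term, []))
    return posted


postings = generic_posting(["STAINLESS STEEL"])
print(sorted(postings))
```

With postings of this kind, a search on the single descriptor ALLOYS
would retrieve the document without the searcher having to combine
all narrower terms in a Boolean OR query.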

Mr. Jansen agreed and offered a citation from E. Wüster, "Die
Struktur der sprachlichen Begriffswelt und ihre Darstellung in
Wörterbüchern".

The assembly agreed to accept section X of the UNESCO Guidelines.

It was agreed to postpone point 3 of the revised agenda until after
formulation of a definition, and discussion was continued on point
4a: "The methods of building thesauri".

Mr. Toman proposed two thesauri building methods: from the bottom
and from the top. The first would involve collecting terms at random
from amongst other thesauri, glossaries, etc., then deciding on
hierarchy, synonyms and homonyms. "From the top" implied selecting
large classes, subdividing them more specifically, and continuing in
this fashion.

Mr. Mojzisek briefly explained the statistical approach being
developed at his institute in Prague and referred the group to his
printed report on the subject.

Mr. Leski explained, with schematic diagrams, the system being used
in Warsaw, which combines a theoretical and practical approach.

Mr. Schiff explained the KWIC indexing being used at the Central
Technical Library in Budapest in order to provide subject
specialists with information as quickly as possible. Experts then
underscore the relevant words from the KWIC for each document and
return the information to the library. They are thus building up a
file of subjectively-approved descriptors that will be used in the
future for thesauri building.

Discussion turned to "overlapping fields", and Mr. Leski said that
a thesaurus cannot be narrower than a discipline. He graphically
displayed the problem of discipline overlap between chemistry,
physics, and biology. He stressed that hierarchies may change within
the area of overlap.

Mr. Mojzisek stated that his statistical method would solve
overlap.

Mr. Gravesteijn discussed the difficulty of resolving overlap
problems with disparate disciplines, or even sub-disciplines such as
exist in geology.

Mr. Spang-Hanssen. Concerning singular and plural, cf. UNESCO
Guidelines, VIIc:

"The use of singular versus plural form to denote different
concepts /cf. the example WOOD versus WOODS in the UNESCO
Guidelines/ is comparable to the use of qualifiers for homonyms
/cf. the example BEAMS /ELECTROMAGNETIC/ versus /STRUCTURAL/, and
since there are languages in which a plural ending is not always
present /e.g. Swedish/, the use of qualifiers seems recommendable as
a universal tool.

"In cases where only the singular or the plural form of a given
word is included in a thesaurus, the question is reduced to a pure
matter of expression, and it seems recommendable that either
singular or plural can be used consistently".

Concerning the reference systems suggested by Mr. Varga and
Mr. Jansen:

"The more elaborate systems of relationships, including aspects
that have been suggested by Mr. Varga and Mr. Jansen, seem to me
comparable to the well-known roles introduced by Costello and others
as a syntactic supplement to Taube's uniterms.

"However, it is a new point of view to include indicators of
aspects etc. in the thesaurus itself, which is usually regarded as a
semantic /not a syntactic/ tool, comparable to a vocabulary".

Wednesday, 25th March 1970. Morning Session.
Chairman: J. Gravesteijn



After approval of the minutes of the meetings on March 24th, the
discussion continued on point 4a: methods of building thesauri for
overlapping fields.

Mr. Toman mentioned the fact that thesauri are not the first tools
of indexing and retrieval in the history of documentation and that
many publications have been written on subject headings,
conventional classification systems and faceted classification in
the past. This knowledge and experience should be used for the
benefit of the construction of modern thesauri.

In conventional information systems the UDC played an important
role in many countries, being the integrating factor in the network
of information centres. Introducing special thesauri would mean
atomizing this network. Each information centre in each country is
building its own thesaurus without regard to neighbouring fields and
to similar information centres abroad.

On the other hand it is clear that a mechanized information system
needs deep indexing and this demands an ordering system made to
measure and not a universal system. We must realize that the problem
of the relation of special and universal ordering systems is of
capital importance to this and other similar conferences. The
question of relations between the thesaurus for a special field and
some superstructure, perhaps of very generic character, must be
solved.

Mr. Rolling gave an example of compatibility between several
thesauri in the same field /metallurgy/ and of compatibility between
thesauri in different overlapping fields /nuclear sciences and
metallurgy/.

UNESCO is fully aware of the problem of proliferation and
Mr. Wysocki mentioned the existence of the two clearing houses in
Warsaw and Cleveland dealing with thesauri.

Mr. Leski stated that there must be valid reasons which impelled
information specialists to work on other ordering systems than
classification systems. After having changed from a formal to a
language system, the problem now is to establish compatibility
between the existing thesauri.

The disadvantages of classification systems were discussed by
Mr. Mojzisek, Mr. Jansen, Mr. Aitchison and Mr. Lloyd, as far as the
handling of overlapping fields and communication between user and
file are concerned.

Mr. Wysocki pointed out that the integration of thesauri in a
[illegible] system is one of the [illegible] of the UNISIST project.
In this project two systems are distinguished:

a/ a broader scheme of classification

b/ [illegible] of knowledge.

After discussion the assembly concluded, on the proposal of
Mr. Rolling, that we can distinguish three different levels in
ordering systems:

1/ A specific level serving for indexing and retrieval purposes.

2/ A general level to assure compatibility between thesauri.

3/ A classification system that plays an organizing role.

The factors influencing the compatibility of thesauri in related
fields are:

- homonyms

- synonyms

- relations between terms belonging to different fields

- point of view when considering descriptors from one field or
another.

The need for cooperation in establishing thesauri in the same field
and in overlapping fields was admitted by the assembly.

Mr. Mojzisek stated, however, that the problems related to
overlapping fields can only be resolved from the point of view of
each field concerned.

A lecture by Mrs. Bellert on the linguistic approach to thesauri
building was a welcome contribution to the discussion, and it
clarified some of the terms used earlier.

Point 4a: Methodological problems in the establishment of
multilingual thesauri.

The chairman described the activity of ICSU-AB. The project for a
multilingual thesaurus in the field of geology was mentioned
/English, French, German, Russian/.

Some complementary information on existing multilingual thesauri
was given by Mr. Rolling. The languages used in the case of the
thesaurus of the DIRR /Roads/ are English, French, and German. The
languages are English, German and Italian in the case of the
thesaurus of the "Centro Sperimentale Metallurgico". The European
Committee is also working on a multilingual thesaurus.

Mr. Leski reported that a multilingual thesaurus in the "Science of
Science" field for Polish, German, Czech and Russian has been built.

Mr. Malmsten, who had already drawn attention to semantic problems
when comparing thesauri in different languages, mentioned the work
done by the "Comité international des arts et des traditions
populaires" /8 languages/.

According to Mr. Spang-Hanssen, experience with multilingual
thesauri in Scandinavia showed that these thesauri can only be
established in fields with fixed terminology.

Wednesday, 25th March. Afternoon Session. Chairman: T. M. Aitchison



POINT 4a ON AGENDA: METHODS OF BUILDING THESAURI

The discussion of this item was continued from the morning session.

Mr. Jansen stated that the IDC thesaurus is multilingual in German
and English and to a lesser extent in French. It is used by them in
indexing papers in these languages.

Mr. Leski explained that his group used only descriptors and
candidate descriptors. If a precise descriptor could not be found,
an auxiliary term in a foreign language might be used, but
controlled terms were required in social sciences.

Mr. Rolling stated that there were two types of multilingual
thesauri: those which were built up simultaneously in each language,
and those in which the thesaurus in one language was translated into
other languages.

Mr. Robowski referred to his experience in dictionary preparation.
It was impossible to translate terms from one thesaurus to another:
instead one should translate concepts, i.e. descriptors as classes,
not according to the exact meaning of the words.

Mr. Maixner considered it advisable to translate whole sentences
rather than word for word. One might try to do this by machine, but
the problem of machine translation had not been solved.

Mr. Lloyd did not agree that one could not translate one thesaurus
into another.

Mr. Jansen suggested that while 90% of the words might be
translated satisfactorily, in 5% to 10% the translation would
produce noise.

Mr. Leski agreed with Mr. Rolling's division of multilingual
thesauri into two types. He considered that the same schema must be
used for both thesauri if they were to be compatible.

Mr. Varga suggested that concepts were different in different
languages and that one could translate from a language with broad
concepts, but not the other way round.

Mr. Rosenbaum suggested that concepts were mental things, but were
expressed in words.

Mr. Spang-Hanssen considered that the conference was confusing
translation of items of information with translation of
vocabularies.

Mr. Lloyd stated that Mr. Jansen's 5% was dealt with in other
situations by retaining the word in the form of the other language,
i.e. by not translating it.

Mr. Mojzisek reminded the conference that it was concerned with
technical information, and that definition and precision were
required.

Mr. Molnar considered that the conference was not obliged to reach
agreement on methods, and that the problems belonged to the science
specialist rather than the information specialist.

Mr. Jansen explained that although one can translate any sentence
from English into German or vice versa, the subdivisions of the
English term might be very different from those of the equivalent
German term.

Mr. Lloyd thought that if it were possible to translate thesauri in
special subject fields, it was possible to build thesauri in
different languages within these subject fields.

Mr. Rolling stated that the assembly could either go on with the
discussion for some days or reserve a time at this conference or
later, depending on when the UNESCO Guidelines dealing with
multilingual thesauri would be ready.

Mr. Wysocki said that the first draft was planned for July 1970.

It was agreed to leave the discussion of multilingual thesauri and
to move on to the next item on the agenda.

POINT 4b ON AGENDA: METHODS OF BUILDING DESCRIPTORS

Mr. Wysocki drew attention to items [illegible]: "Selection of
descriptors" and [illegible]: "Methods of entering" of the
Guidelines.

Mr. Rosenbaum considered that the problem had already been
discussed and suggested that the meeting proceed with other items.

Mr. Malmsten suggested that in building descriptors one should be
concerned also with the topology or structure /neighbourhood/.

Mr. Rolling mentioned that descriptors were added and eliminated
throughout the life of a thesaurus and pointed out that this was
included in the Guidelines.

Mr. Toman referred to items other than descriptors or
non-descriptors: candidate descriptors and explicative words added
to descriptors. He also mentioned descriptors used in parentheses in
abstracts.

Mr. Aitchison mentioned the problem of degree of pre-coordination.

Mr. Toman stated that this depended on the requirements of the
user.

Mr. Rosenbaum considered that machines allow any level of
pre-coordination without any difficulties.

Mr. Jansen recommended that compound concepts should not be split
up into less than single independent and unambiguous concepts.

The delegates agreed to provide UNESCO separately with any comments
on the relevant sections of the Guidelines they wished to make.

POINT 5 ON AGENDA: CONDITIONS WHICH DESCRIPTORS AND THESAURI MUST
FULFIL IN ORDER TO ENSURE THEIR INTER-BRANCH AND INTER-LANGUAGE
COMPATIBILITY

It was agreed that these points had already been discussed.

POINT 6 ON AGENDA: CONDITIONS WHICH MUST BE FULFILLED BY
DESCRIPTORS AND THESAURI AS TOOLS FOR FURTHER DEVELOPMENT OF
[illegible]



Mr. Leski explained that it might be that the thesaurus would only
be suitable for use with given aims and at a given period of time
and might not be adaptable for other information processing
conditions. On the other hand, it may be a tool which could be
further developed in roles other than retrieval, for example in
machine analysis of texts, or in the synthesis and analysis of
information.

Mr. Rosenbaum said that thesauri might be used for other purposes,
including developing a science and finding relations between
concepts.

Mr. Malmsten suggested another use: in standardising glossaries,
where a thesaurus rather than a list of terms would be provided.

Mr. Robowski doubted the future of thesauri since they could not be
built up by machine and were not necessary for the machine
manipulation of large numbers of documents.

Mr. Weeks suggested that thesauri might be suitable for 1960 but
that something different would be required in the next few years.

Mr. Malmsten pointed out that a number of important information
services required no thesaurus and suggested that this point should
be discussed.

Mr. Lloyd considered, however, that at the present stage of the
computer art, with comparatively small stores, high input cost,
etc., thesauri were necessary.

Mr. Robowski suggested the use of glossaries or lists of keywords
in fields with specialised vocabularies and for small collections of
documents.

Thursday, 26th March 1970. Afternoon Session.
Chairman: H. Spang-Hanssen



The conference turned to point 7 of the revised conference order,
viz. organizational problems; subquestions /a/ and /b/ were treated
together.

Mr. Wysocki reported on UNESCO's activities: guidelines for
monolingual thesauri, to be followed during 1970 by guidelines for
multilingual thesauri; the establishment of two clearing houses for
English and non-English thesauri respectively; the information given
in the UNESCO bulletin; and, finally, the efforts to cooperate with
ISO in setting up regular standards in this field.

Mr. Varga coined the word "superthesaurus" to mean a system for
ordering thesauri, and he referred to Mr. Rolling's previous remarks
concerning levels of compatibility.

Mr. Rolling, Mr. Wysocki, Mr. Malmsten and Mr. Spang-Hanssen
pointed to the more general nature - as opposed to a thesaurus-like
nature - of an ordering system or a classification for thesauri.

Mr. Toman stressed the importance of common facets in ordering
systems for this purpose, and Mr. Leski supported this view by
pointing to the need of a system for common facets and auxiliary
terms to deal with e.g. geography.

As a conclusion concerning organizational problems the conference
recommended support for UNESCO's existing and planned activities as
regards information about the building of thesauri and about
existing thesauri.

The conference then returned to the previously postponed Point 3,
Constructional problems, in particular Point 3e: Which elements
should be included in a thesaurus?

There was agreement on the understanding of "elements", as well as
on the various possible presentations of a thesaurus /cf. UNESCO
Guidelines, Third Draft/, the elements, viz. descriptors,
non-descriptors, and relational indications, being included in the
various presentations.

Mr. Jansen pointed to the scheme of elements given on p. 4 of his
conference paper, and he gave priority to the thematic group as an
element in comparison with alphabetical listing.

Mr. Leski, referring to p. 4 of his conference paper, gave priority
to the scheme /corresponding to the facet grouping in the UNESCO
Guidelines/ in comparison with graphic display, and in turn priority
to this in comparison with alphabetical listing.

Mr. Toman and Mr. Malmsten advocated the inclusion of all kinds of
references in the alphabetical listing, at least for certain
practical uses, while Mr. Rolling pointed to certain inconveniences
in burdening an alphabetical list with, among other things, a great
number of related terms.

The indication of frequency of descriptors was mentioned as a
possible element of a thesaurus, but Mr. Schiff found this to be of
little value, e.g. in indexing.

Mr. Mojzisek pointed to the usefulness of stating the date of the
introduction of a new descriptor or of eliminating an obsolete one
to solve the problem of updating a thesaurus.

Mr. Wysocki agreed on this point and referred to the UNESCO
Guidelines sect. [illegible]. He found that what has been said about
presentation was in essential agreement with the Guidelines.

Mr. Toman and others would prefer another designation than "facet
grouping" for the way of presentation dealt with in section XI /b/
of the Guidelines.

Mr. Gravesteijn and Mr. Malmsten objected to the concept of listing
as used exclusively for alphabetic lists.

Mr. Mojzisek pointed to the value of an explanatory introduction as
an element of any thesaurus, cf. sect. II of the UNESCO Guidelines.

Mr. Aitchison expressed doubt as to the relevance of the remarks
about computer storage found in sect. XI /d/ of the Guidelines.

As a conclusion concerning elements of a thesaurus, the conference
recommended an explicit statement on the priority of systematic
displays by rephrasing the first part of sect. XI of the UNESCO
Guidelines, Third Draft in the following way:

"XI. Presentation of Thesaurus

It is recommended that a thesaurus be presented in one or more
systematic displays and in an alphabetical listing."

Secondly, the conference recommends a less negative formulation as
regards the introduction of structure in an alphabetical list.

Thirdly, the conference recommends that the way of presentation
designated by "facet grouping" is designated also by other terms in
common use for this way of presentation.

Friday, 27th March 1970. Morning Session. Chairman: D. C. Weeks



Mr. Spang-Hanssen presented a summary of the previous session, of
which he was Chairman.

The initial topic of the session was the evaluation of thesauri,
added to the agenda at Mr. Aitchison's request. The Chairman
specified thesaurus evaluation as having these implications:

1/ the efficiency of its function in a system - its technical
qualities;

2/ the effectiveness, which is a measure of the degree to which it
is capable of fulfilling the user's needs and of describing the
user's problem. When evaluation includes comparison of two or more
thesauri, then it is necessary to distinguish between thesauri that
define a given domain of knowledge, and those which are specifically
system-related. In the first instance, comparison cannot be based on
substantive considerations since their content is different. In the
second instance, their content may be similar, but their system
applications may differ.

Mr. Aitchison explained his interpretation of the matter by stating
that he perceived two questions: 1/ By what means can we compare the
performance of a system employing a thesaurus and one that does not?
Is there a way of proving the benefits in performance obtained by
one of these alternatives? 2/ Evaluation implies a comparison of
different forms of thesauri.

The Chairman added that when systems with thesauri are compared
with systems having none /free-text/, we must recognise that input
and processing are quite different, that the set of rules is also
dissimilar and that assessment must be made by keeping the
implications of those differences clearly in mind.

Mr. Toman suggested that a useful comparison might be made between
systems employing other means, such as the Cranfield experiment
which compared various techniques. He considered the problem to
contain two distinct elements:

1/ Evaluation of other ordering systems /this would permit
evaluation of the thesaurus as a tool/ and

2/ Measurement of effectiveness /assessment of the operational
effectiveness in a system/.

Mr. Malmsten added the suggestion of the anti-thesaurus concept.
Scandinavian systems received feed-back from industrial
organizations where abstracts were used.

Mr. Mojzisek stated that all systems function on a set of rules
which determine how processing is accomplished. It is necessary to
analyze the file, to evaluate the retrieval requests, to make
comparisons among these requests and thus to evaluate the amount of
noise in an operating system. He noted the difference between an
indexing language and the language of retrieval. To reconcile their
differences it is necessary to develop a systematic or
meta-language. The existing difference - one in which a system
language loses its grammar - is a principal reason for the presence
of noise. Only by correlating input and output languages can this
problem be solved.

The Chairman stressed the gap between the efforts that have so
nearly resolved problems of semantics and the minor progress toward
solutions for the serious syntactic difficulties that impede system
effectiveness.

Mr. Maixner emphasized the importance of Mr. Mojzisek's remarks. He
added that thesauri can only be evaluated as integral parts of a
retrieval language; that when a thesaurus states its language in a
paradigmatic way and the two are then compared, we can then observe
their levels of efficiency.

Mr. Mojzisek: the rules for making a thesaurus, when applied to
making a system dictionary, are all system-related, producing a new
language which is not that of the source /document/, nor of the
users. These languages need to develop a grammar which will bring
all elements into correspondence.

Mr. Rolling: Apropos the Cranfield experiment, it is essential to
remember that this effort was a comparison of ordering systems, all
placed in the same environment as opposed to the individual, natural
environments. Comparison of results is, therefore, difficult because
evaluation is usually applied to systems as entities. Mr. Rolling
urged continual internal evaluation of a thesaurus based on failures
of recall and precision. He had found that a large percentage of
failures were caused by imprecision in the thesaurus.

Mr. Aitchison reminded the group that Cranfield included two
different studies. The second was able to show significant
differences in coordinate indexing.

Mr. Lloyd stated that any test or measurement was not merely a test
of the thesaurus but was in fact a system test. He proposed that
economics was a critical factor, and in a system without a thesaurus
it is possible to achieve high recall and relevance but only at a
very high cost.

Summing up the topic, Mr. Lloyd offered the view that systems
employing thesauri should be constantly evaluated for effectiveness
and economy and the thesaurus updated accordingly.
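
The recall and precision (relevance) figures mentioned in this
discussion can be made concrete with a small calculation. The Python
sketch below uses the usual set-based definitions; the function name
`recall_and_precision` and the document identifiers are invented for
the example and do not come from the conference or from the
Cranfield data.

```python
def recall_and_precision(retrieved, relevant):
    """Return (recall, precision) for one retrieval request.

    recall    = relevant documents retrieved / all relevant documents
    precision = relevant documents retrieved / all documents retrieved
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    recall = len(hits) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical request: 4 documents retrieved, 5 relevant in the file.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d2", "d5", "d6", "d7"]
recall, precision = recall_and_precision(retrieved, relevant)
print(f"recall={recall:.2f} precision={precision:.2f}")
```

In this invented case two of the five relevant documents are
retrieved (recall 0.40) and two of the four retrieved documents are
relevant (precision 0.50). The retrieved documents that are not
relevant correspond to the "noise" spoken of above, and the relevant
documents that were missed are recall failures.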

Definition of a thesaurus:

It was suggested by Mr. Toman and agreed upon that the assembly
offer a description rather than a definition /to avoid the
strictures of a formal definition/. Mr. Molnar offered a formal
description of a thesaurus developed from a synthesis of conference
agreements. These included matters of /1/ order /systematic and
alphabetic structure/; /2/ terms /descriptors and non-descriptors,
regretting the decision to abandon the designation ASCRIPTOR/; and
/3/ interrelations /preferential, hierarchical and affinitive/.

A general discussion yielded agreement that in reference to
structure, systematic order was to take precedence over alphabetic.
Mr. Toman reverted to his statement at the first meeting, when he
observed that most ordering systems could be described by these same
qualities.

Four members of the conference offered written descriptions which
are included here /these are attached/:

1/ Mr. Molnar

2/ Mr. Rolling

3/ Mr. Toman

4/ Mr. Poletylo

The Chairman synthesized the proposed descriptions so as to meet
these requirements:

1/ That we state the nature of a thesaurus - to what class of
resources it belongs.

2/ The contents are named.

3/ Its qualities are stated:
/It covers a domain/
/It is controlled, dynamic/
/It is arranged in systematic order/

4/ Its principal use:
/coordinate indexing systems/

The combined description was accepted by the conference.



Mr. Poletylo

When we say that a thesaurus is an indexing tool, we must add what
we think about coordinate indexing. Therefore the definition of a
thesaurus may be as follows:

A thesaurus is the controlled vocabulary of a
coordinate-indexing-language.

Consequently:

A descriptor is the preferred term in the controlled vocabulary of
a coordinate-indexing-language.

A non-descriptor is a forbidden term in the controlled vocabulary
of a coordinate-indexing-language.

Without adding "coordinate", the definition will be too broad. The
term "coordinate" is a feature that distinguishes a descriptor
language from all indexing languages, in contradistinction to
precoordinate-indexing-languages such as classification and
subject-heading languages.

The term "controlled" distinguishes a descriptor language from
coordinate-indexing-languages and from a Uniterm language /which is
uncontrolled/.
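
Mr. Poletylo's three definitions can be read as a small data model.
The following Python sketch is one possible reading and not part of
his statement: the sample terms, the `USE` table mapping each
non-descriptor to its preferred descriptor, and the function names
are all assumptions. It shows a controlled vocabulary in which
forbidden terms are replaced by descriptors at indexing time, and a
coordinate search in which descriptors are combined only at
retrieval time.

```python
# Hypothetical controlled vocabulary.
DESCRIPTORS = {"AIRCRAFT", "CORROSION", "ALUMINIUM ALLOYS"}
# Non-descriptors (forbidden terms) point to the preferred descriptor.
USE = {"AEROPLANES": "AIRCRAFT", "RUSTING": "CORROSION"}


def control(term):
    """Map a term to a descriptor, or raise if it is outside the vocabulary."""
    term = term.upper()
    if term in DESCRIPTORS:
        return term
    if term in USE:
        return USE[term]
    raise KeyError(f"{term!r} is not in the controlled vocabulary")


def index(documents):
    """Build postings: descriptor -> set of document identifiers."""
    postings = {}
    for doc_id, terms in documents.items():
        for term in terms:
            postings.setdefault(control(term), set()).add(doc_id)
    return postings


def coordinate_search(postings, *terms):
    """Coordinate (combine with AND) descriptors at retrieval time."""
    sets = [postings.get(control(t), set()) for t in terms]
    return set.intersection(*sets) if sets else set()


docs = {
    "doc1": ["aeroplanes", "corrosion"],
    "doc2": ["aircraft", "aluminium alloys"],
    "doc3": ["rusting"],
}
postings = index(docs)
print(sorted(coordinate_search(postings, "aircraft", "rusting")))
```

In terms of the distinctions drawn above, an uncontrolled language of
the Uniterm kind would omit the `control` step and accept any word,
while a precoordinate language would store ready-made combinations
instead of combining descriptors at search time.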

Mr. I. Molnar

After a 4-day conference, after many friendly and helpful
discussions, after many agreements, approximate agreements and
disagreements too, we have not reached an agreed definition of our
fundamental topic, although discussions have taken more than 20
hours.

Could such a meeting of experts, practical creators of information
systems and thesauri, not find a name for their common beautiful
son?

In contrast to this, every participant in the conference has his
own definitions for thesauri. It is a pity that these definitions
reflect as many points of view as the number of participants.

Well, in this difficult situation, in this uncomfortable richness
of definitions, we have the duty to find the common characteristics
of them all. This could be the first approximation of the problem.
Forgive me if I begin with a rather general definition of my own.
This is as follows:

1. A thesaurus is in general an ordered collection of terms which
also includes their inter-relations.

This includes no information on either the principle of ordering,
or the nature of terms, or the branches of science, or the types and
nature of inter-relations between the terms included, nor the task
of the thesaurus. This definition is of such a general validity that
everyone can agree with it. Everything it includes is inevitable, a
"sine qua non" for a thesaurus. But this determination does not
include some elements which must be further investigated.

The definition in this form is able to cover all thesauri from
Roget's to the Euratom thesaurus.

I hope we can be agreed on the basis of this minimum programme. If
so, we are able to move on in the direction of deeper details,
leading to a more detailed definition.
a/ The first element in my definition which requires further
analysis is the expression "ordered". This word includes the
problems of the structure of a thesaurus.

We have already found agreement in relation to this expression.
The material of a thesaurus can be ordered alphabetically or
systematically, but it is advisable to prepare both an
alphabetical and a systematical ordering.

By systematical ordering we understand the hierarchical or graphic
display of terms.

b/ The second element which must be investigated is the expression
"terms".

We already have a common opinion on this question too. Terms are
defined as consisting of descriptors and non-descriptors. /I really
think it regrettable that the term "ascriptor" is dead in "statu
nascendi"/. But we are in agreement, I see, on the problem of
auxiliary terms, too. These seem to be of a descriptor character.

/The definition of descriptor is not included in the definition of
a thesaurus. This is for the sake of brevity/.
c/ The third element to be detailed is the expression
"inter-relations". We tend to agree on the recommendations of the
UNESCO Guidelines and so we have three groups for
inter-relationships of terms.

These are

1. preferential

2. hierarchical and

3. affinitive

relations. The common name of 1. and 3. is "non-hierarchical
inter-relationships".

As a consequence of these agreements-in-detail, the second
approximation of the definition could be as follows:

2. A thesaurus is in general an alphabetically or/and
systematically ordered collection of descriptors and non-descriptors
having hierarchical and/or non-hierarchical inter-relationships.
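
The second approximation lends itself to a compact data-structure
sketch. The Python below is an illustration only; the class and
field names and the sample terms are assumptions, not anything
proposed at the conference. Each entry is a descriptor or a
non-descriptor and carries the three groups of inter-relationships
named above: preferential, hierarchical, and affinitive.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Term:
    """One thesaurus entry: a descriptor or a non-descriptor."""
    name: str
    is_descriptor: bool = True
    use: Optional[str] = None                          # preferential relation
    broader: List[str] = field(default_factory=list)   # hierarchical relation
    narrower: List[str] = field(default_factory=list)  # hierarchical relation
    related: List[str] = field(default_factory=list)   # affinitive relation


# Hypothetical sample entries.
TERMS = [
    Term("METALS", narrower=["COPPER"]),
    Term("COPPER", broader=["METALS"], related=["BRONZE"]),
    Term("BRONZE", related=["COPPER"]),
    Term("CUPRUM", is_descriptor=False, use="COPPER"),
]


def alphabetical(terms):
    """Alphabetical ordering: every term, descriptors and non-descriptors."""
    return sorted(t.name for t in terms)


def systematic(terms, indent=0, roots=None):
    """Systematic ordering: descriptors displayed by hierarchy."""
    by_name = {t.name: t for t in terms}
    if roots is None:
        # Top level: descriptors that have no broader term.
        roots = sorted(t.name for t in terms if t.is_descriptor and not t.broader)
    lines = []
    for name in roots:
        lines.append("  " * indent + name)
        lines.extend(systematic(terms, indent + 1, sorted(by_name[name].narrower)))
    return lines


print(alphabetical(TERMS))
print("\n".join(systematic(TERMS)))
```

The same collection of terms thus yields both orderings, which is
one way of reading the "alphabetically or/and systematically
ordered" wording of the definition.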

This second definition is somewhat more precise and of a 
more specialized character. 

There is an unnecessary element in the definition. The third
approximation is directed towards the elimination of this
element, finding a better corresponding word instead. This
unnecessary element is the expression "in general".

This elimination can be realized only by determining the
precise purpose of the thesaurus.

In the last three days we have talked a lot about the various
possibilities and directions of the further development of
thesauri. We have verified the science-developmental function,
its science-organizing power, we mentioned the linguistic
possibilities, etc. But we gave the greatest part of our
discussions to the use of thesauri in indexing work,
information storage and retrieval, as well as to the
standardization of information queries and to the further
development of scientific information.




It is well known that there is a general and essential
correlation between function and structure in the living world
as well as in dead material. And this correlation is valid in
the case of thesauri, too.

A thesaurus, the task of which is to introduce someone
into the world of science, must have a given structure which
makes this possible. This structure can be a hierarchical one.
Another thesaurus intended as an effective tool for indexing
documents must be of alphabetically arranged structure, because
this is the only way an indexer can do his job.

On the basis of this explanation it is possible to realize
the third approximation for a definition of thesauri, but only
when postulating its specific task.

We are information specialists. Thesauri are very important
tools for our fundamental activity. Therefore we must
investigate only one side of the whole problem of thesauri, and
this is the side of information activity.

After this statement I feel it possible to define a thesaurus
in a more exact form:

A thesaurus as a tool of information systems is an
alphabetically and/or systematically ordered collection of
descriptors and non-descriptors having hierarchical and/or
non-hierarchical inter-relationships.

This definition seems to correspond to our information
tasks, though it includes no statements in connection with
traditional or computerised uses. But such a statement would
be incorrect, because of the relatively great number of
existing traditional systems.

Other thesauri for other purposes can be otherwise defined.

Mr. Bolling

A thesaurus is a controlled but dynamic vocabulary of
semantically related terms offering comprehensive coverage of a
specific domain of knowledge.

Its main use is in the subject characterisation of documents
and queries in information storage and retrieval systems
based on concept coordination.

Its principal elements are descriptors, non-descriptor
terms and relationship indicators.

It generally comprises one or more systematic displays
and one or more alphabetical listings.

Mr. o m a n

A thesaurus is a system of controlled terms used for
characterising the content of documents /information/ in storage
and/or retrieval systems. It is characterised in its dynamics
by a weak hierarchy, by preferring postcoordination to
precoordination. In contrast to classification systems, it
displays affinitive terms but does not use notation and
categories.

Friday, 27th March. Final Session, 11.4^ a.m. Chairman: K. Leski



The session was devoted to the problem of formulating
conclusions embracing the results of the whole conference. The
following conclusions were taken:

I. A thesaurus is a lexical tool of information retrieval
systems. It consists of a controlled but dynamic vocabulary of
semantically related terms. This vocabulary, which
comprehensively covers a specific domain of knowledge, is a
systematically and alphabetically ordered collection of
descriptors, non-descriptors /auxiliary terms/ as well as
indicators of their relationships both hierarchical and
non-hierarchical.

Unlike classification systems a thesaurus does not
necessarily use notations and categories.

Its main use is in the subject characterization of documents
/information/ and queries in systems based on concept
coordination.

When discussing the question of the display of descriptors,
the importance of the systematic display, whether in thematic,
hierarchical or graphical form, or all of them together, was
stressed. It was recommended that the UNESCO Guidelines put the
systematic display before the alphabetic in their wording.

II. The main role of a thesaurus is to ensure that indexing
and retrieval of documents can be effected with a maximum of
precision and completeness.

In addition, the existence of a thesaurus covering a subject
field tends to stabilize the terminology of this field.

In particular:

- Thesaurus work leads to the formulation of more precise
definitions of inconsistently-used terms.

- New terminology is properly defined and given its
preferred place in the semantic structure of the vocabulary.

- Editors could use thesauri in advising authors to use
correct terminology in their publications /particularly
in titles, which are used for KWIC and other indexes/.

- Editors could use thesauri for preliminary tagging of
articles or abstracts, thus facilitating the work of
subsequent scanners and indexers.

- Dictionaries should include mention of preferred
terminology in the major thesauri.

- Thesauri constitute the starting point for elaboration
of the more complex lists of words used in free-text
processing.

Systems employing thesauri should be continually scrutinized
and evaluated in terms of recall, relevance, and economics,
and the thesauri should be updated accordingly.
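The evaluation in terms of recall and relevance mentioned above might be sketched as follows. This is only an illustration: the function name, the document identifiers and the two ratio definitions (hits divided by relevant documents, hits divided by retrieved documents) are assumptions, since the conclusions do not prescribe any formula.

```python
def recall_and_relevance(retrieved, relevant):
    """Return (recall ratio, relevance ratio) for one search.

    recall    = relevant documents retrieved / all relevant documents
    relevance = relevant documents retrieved / all documents retrieved
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant
    recall = len(hits) / len(relevant) if relevant else 0.0
    relevance = len(hits) / len(retrieved) if retrieved else 0.0
    return recall, relevance


# Hypothetical search result: four documents retrieved, three relevant in the file.
retrieved_docs = ["D1", "D2", "D3", "D4"]
relevant_docs = ["D2", "D4", "D7"]
recall, relevance = recall_and_relevance(retrieved_docs, relevant_docs)
print(recall, relevance)
```

Repeating such a measurement before and after a thesaurus revision gives one rough indication of whether the update improved retrieval.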

III. Three types of factors influence the elaboration of
thesauri:

1. Factors related to subject area and language

- volume of literature to be covered

- language distribution of this literature

- redundancy /or lack of precision/ of terminology

- overlapping with fields already covered

2. Factors related to the user population

- degree of precision required

- degree of completeness required

- response volume required

3. Factors related to system methodology

- equipment used /degree of mechanization/

- storage media used

- file organization

- search logic

IV. The general opinion was that it was impossible to build
a universal thesaurus in the sense of the detailed UDC
universal system, but that if the need is felt for a
superstructure covering the existing special thesauri it should
be a very generic classification scheme in the sense of the
recommendation made during the UNISIST control committee
meeting in December 1969. It was suggested to the UNESCO
delegate to provide, when constructing a general scheme, for
lists of auxiliary terms and common facets from which the
special thesauri could eventually draw the terms needed for the
description of the form of documents and such terms which are
common to many thesauri.

In order to obtain a worldwide information system, three
levels of ordering systems are necessary:

1/ Mission- or discipline-oriented thesauri serving on a
specific level for indexing and retrieval purposes
2/ General thesauri assuring compatibility between specific
thesauri in related fields
3/ An ordering system displaying the different branches
of human knowledge at a very general level.

V. The problems of multilingual thesauri were discussed
exhaustively. It was agreed that there are two approaches to
the building of multilingual thesauri: simultaneous
construction of thesauri in different languages, and
translation of a thesaurus in one language into thesauri in
other languages. Similarly, there are two methods of use:
simultaneous use of the thesaurus by two or more language
groups, and transfer of a data base from one language to
another. It was recommended that UNESCO should take account of
the deliberations of the meeting in preparing their guidelines
for multilingual thesauri.

The conference agreed in principle with the UNESCO
Guidelines, Section VI - "Selection of descriptors" and Section
IX - "Methods of entering descriptors in the Thesaurus" and
agreed to provide UNESCO with their own comments.



GUIDELINES FOR THE ESTABLISHMENT AND DEVELOPMENT
OF MONOLINGUAL SCIENTIFIC AND TECHNICAL THESAURI
FOR INFORMATION RETRIEVAL

United Nations Educational,
Scientific and Cultural Organization*



Explanatory statement 

At a time when the establishment of a World Science
Information System is being seriously proposed**, it is
advisable to remember that the viability of any world system
depends first and foremost on compatibility between its
component parts.

These guidelines for the establishment and development of
monolingual scientific and technical thesauri for information
retrieval are published in an attempt to lay the basis for
compatibility, both at present and in the future, of thesauri
that are being elaborated simultaneously in most of the
disciplines of science, basic as well as applied.

They are, therefore, directed to all those who in the
course of their career come into contact with thesauri, either
as users or as thesaurus compilers.

The first draft of the guidelines was prepared by the Unesco
Secretariat. The third draft and this version were subsequently
reviewed and studied by eminent and competent individuals and
organisations and the relevant additions or corrections were
made. Thus, these guidelines were presented to, and discussed
by, the International Conference on the General Principles of
Thesaurus Building in Warsaw, March 1970. The proposed changes
are included in this version /1/. May our collaborators accept
anonymity along with our gratitude.

* SC/MD/20 - Paris, 6 July 1970 - Original: English

These guidelines were specifically drafted for the English
language and when applied to monolingual thesauri in other
languages they should be modified to take into consideration
the attributes and uses of that particular language.

** Joint Unesco-ICSU study on the feasibility of a World
Science Information System /UNISIST/. Final Report. Unesco,
Paris, 1970.

Fourteen guidelines are presented: the first four are of a
general nature, the following seven deal with the establishment
of thesauri, and the final three relate to the development of
thesauri. Examples, where appropriate, are given on the
right-hand margin of the text.

By the word "thesaurus", as used in the present text, is meant
a controlled and dynamic vocabulary of semantically and
generically related terms which comprehensively covers a
specific domain of knowledge. This vocabulary is a systematical
and/or alphabetical collection of descriptors, non-descriptors
/auxiliary terms/ as well as indicators of their relationships.
Unlike classification schemes, the vocabulary does not
necessarily use notations and categories.

A descriptor is an authorized and formalized term or symbol
in a thesaurus, used to represent unambiguously the concepts of
documents and queries /2/.

Thesauri should be based on concepts and relationships which
are internationally acceptable. Original and translated thesauri
already exist in most of the major vehicular languages used in
science and technology today. It is rare that any particular
word can be translated univocally into another language without
losing some shade of meaning in the process, but it is hoped
that the application of these guidelines to monolingual thesauri
will diminish the enormous difficulties encountered in the
establishment of thesauri in different languages. These
guidelines were originally drafted in English and when applied
to monolingual thesauri in other languages, they should be
modified to take into consideration the attributes and uses of
that particular language /e.g. number of descriptors, VIII b/.

Thesauri can be used in many ways, and the structure of a
thesaurus is intimately related to its proposed utilisation. A
thesaurus can be used merely as a word association list for
helping indexers, or it can be considered as a transformation of
the natural language into the information language /3/.

Modern techniques in information science are nearly all
based on the use of electronic computers and it is in this
connexion that the use of thesauri is rapidly proliferating. It
is this rapid proliferation which has brought the need for
international guidelines to light and it was for this reason,
too, that Unesco recently encouraged /helping in the
establishment of one/ the work of two clearing-houses dealing
with thesauri. These clearing-houses are located at the
Bibliographic Systems Center, School of Library Science, Case
Western Reserve University, Cleveland, Ohio 44106, United States
of America, and at the Centralny Instytut Informacji
Naukowo-Technicznej i Ekonomicznej, Al. Niepodległości 188,
Warsaw, Poland, for English and languages other than English
respectively.



General 

I. Advisability of a pilot run 

Before establishing a thesaurus on a definitive basis it is
strongly recommended that a practical test, based on a
restricted number of documents dealing with a small area of the
domains to be ultimately covered, be carried out. This pilot
run, based on tentatively structured terms, should show up the
more adequate methods of descriptor selection and thesaurus
display applicable to the case under consideration. The results
of this test should be critically commented upon by as many
people as feasible, including information scientists and
indexers as well as subject specialists and users.

II. Necessity of a descriptive introduction to the thesaurus

No thesaurus should be presented without a comprehensive
introduction which states clearly the purpose and structure of
the thesaurus, and the domains covered by it. The rules followed
in its establishment should be presented in a condensed form.
This is particularly true of the methods and sources used in the
selection, form and avoidance of ambiguity of the descriptors
/see VI, VII, VIII/. The method of presenting the thesaurus as
well as the rules for alphabetization and punctuation, whenever
applicable, should be explicitly stated.

Most important of all, the rules for using the thesaurus 
and its limits of applicability should be elucidated and illu- 
strated by means of examples, where appropriate. 

Users should be invited to contribute comments and
suggestions for the improvement of the thesaurus, and to
inscribe themselves on the mailing list for future editions of
or additions to the thesaurus. The proposed system for
developing and up-dating the thesaurus should be explained; the
date of the present, and estimated appearance of future,
editions or additions to the thesaurus should be given.

The total number of descriptors, non-descriptors /4/,
identifiers /see VI/, hierarchical chains /see X /b// and
related concepts /see X /c// should be itemized.

III. Necessity of indexes 

Every thesaurus, regardless of its mode of presentation
/see XI/ should contain an alphabetical union list of each
individual unstructured term /5/ whether issued separately as a
supplement or together with the main thesaurus as an annex.
Permutation indexes may also be used.

It may be useful in the case of multidisciplinary thesauri 
to present, in addition, indexes in which the descriptors are 
grouped by discipline. 
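A permutation index of the kind mentioned in guideline III might be sketched as follows. The function name and the sample list are assumptions made for illustration (POWER SUPPLIES is an invented term; the others are taken from the margin examples of the guidelines). Every word of a multi-word descriptor becomes an access point leading back to the full descriptor.

```python
def permuted_index(descriptors):
    """Map each word to the sorted list of descriptors that contain it."""
    index = {}
    for descriptor in descriptors:
        for word in descriptor.split():
            index.setdefault(word, set()).add(descriptor)
    # Sort the access words, and the descriptors filed under each of them.
    return {word: sorted(terms) for word, terms in sorted(index.items())}


# Hypothetical descriptor list.
sample = [
    "ACOUSTICAL HOLOGRAPHY",
    "BRAIN RESEARCH",
    "ELECTRICAL POWER",
    "POWER SUPPLIES",
]
index = permuted_index(sample)
for word, terms in index.items():
    print(word, "->", "; ".join(terms))
```

In this sketch the user who looks under POWER finds both ELECTRICAL POWER and POWER SUPPLIES, although only the second files under P in a direct-entry alphabetical list.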

IV. Notification of intent

The appropriate clearing-house /see above/ should be notified
of the intention to construct a thesaurus, as well as when
the thesaurus is first published or disseminated. This
information should be channelled through the national
organization dealing with thesauri, where and when such an
entity exists.

The same applies for further editions. If at all possible,
a copy of the thesaurus, complete with the introduction and
indexes, should be sent to the clearing-house in question. The
fact of notification should be mentioned in the introduction.

Establishment

V. Check with clearing-house to avoid duplication

Before commencing work on the establishment of the
thesaurus, it is advisable to ascertain whether others covering
that particular domain or a neighbouring one are available.

This is best done by addressing a query to the two
clearing-houses mentioned above. It may be found advisable to go
ahead with the compilation of a particular thesaurus in spite of
the existence of a similar one. In this case the reasons for
proceeding and the differences with the earlier thesaurus should
be clearly stated in the introduction.

VI. Selection of descriptors

The selection of descriptors should begin only after the
general structure of the thesaurus has been agreed upon. It
should be carried out, preferably, by people who have both a
good knowledge of the subject to be treated, and previous
experience in indexing or classification. The use of
internationally recruited teams for the construction of
thesauri is to be encouraged since it widens the cumulative
linguistic experience which goes into the building of the
thesaurus. The methods of selecting descriptors vary according
to the proposed structure of the thesaurus /alphabetical
listing, systematical listing /6/, graphic display /6/, see
XI/, the purpose for which the thesaurus will be used /e.g.
for manual or mechanical retrieval, only for indexing, or as a
secondary tool/ and the background to the project /gradual
build-up to mechanical processing, introduction of a new
domain e.g. interdisciplinary areas for which no previous
classification schemes existed, existence of well-defined
group of users and subject specialists, extensive literature/.

Descriptors, in general, consist of terms related to
discrete concepts encountered in the subject field under
consideration and in pertinent marginal areas. A more specific
class of thesaurus terms /7/ known as "identifiers" may
sometimes be used.

Descriptors should succinctly summarize concepts in as few
words as possible, preferably one. Grammatical connexions such
as prepositions or articles should be avoided whenever
possible.

Identifiers constitute a special type of thesaurus terms /8/
which are not reciprocally cross-referenced /see XI/ and which
serve the purpose of providing additional indexing depth. For
instance, identifiers might include individual trade names,
geographical locations, equipment, nomenclature, code names
etc.

Since they are not reciprocally cross-referenced,
identifiers need not necessarily appear in the thesaurus
display, but may be listed separately, in addition to appearing
in the Union List /see III above/.

Four distinct steps intervene in the selection of
descriptors: collection, verification, evaluation, and choice.



Examples:

Acoustical Holography
Brain Research

IRELAND
NT DUBLIN
DUBLIN



/a/ Collection

It is almost impossible to make a comprehensive collection
of candidate descriptors by ticking off an alphabetical list.
By envisaging descriptors in groups, thought associations
between them give rise to many candidates. Potential users and
subject specialists as well as internationally or nationally
standardized technical dictionaries should be consulted; terms
should be chosen from the current literature; existing word
lists or classification schemes should be culled and may be
expanded or compressed appropriately. Scientific and technical
dictionaries and glossaries, both multilingual and monolingual,
constitute a prolific source of descriptors /see page 182/.

/b/ Verification

With all methods of assembly, the authenticity of the
selected descriptors should be verified by consulting
dictionaries, other indexing or standardized vocabularies,
current usage in the literature and especially the opinion of
subject specialists. Obsolete terminology should not be
included, or if so only as forbidden terms /see X /a/ below/.

One of the more appealing attributes of a thesaurus is its
ability to assimilate immediately the neologisms and special
jargon that proliferate in expanding fields of basic and
applied research. Full advantage should be taken of this
facility in combination with the use of scope notes /see VII
/c/ below/ and cross-references. Special care should be taken
with terms whose connotations have changed with the passage of
time, or whose meaning changes from country to country. If
overlapping terms have to be included the appropriate
cross-reference /see X below/ should be employed.

Examples: BILLION /10 EXP 9/; BILLION /10 EXP 12/

/c/ Evaluation

In evaluating the utility of candidate descriptors,
reference should be made to their:

1. frequency as encountered in the literature or in the
existing stocks of information /9/;

2. anticipated incidence in retrieval inquiries;

3. relationship to descriptors already accepted;

4. appropriateness and authenticity as current terminology
in the discipline concerned;

5. effectiveness and expediency in connoting and denoting
the particular concept.

None of these factors should be considered independently
and particular attention should be paid to areas of peripheral
interest where the exhaustivity and specificity required of the
descriptors are not the same as for the core subject.

/d/ Choice

In all cases, descriptors should be selected for inclusion
in the thesaurus on the basis of their estimated effectiveness
for retrieval purposes and their measurable significance in the
material to be indexed.

VII. Methods of avoiding ambiguity

In compiling a thesaurus, difficulties are encountered with
descriptors which have more than one accepted meaning or whose
meaning in a given context is different from that commonly
encountered. In such cases the required meaning may be brought
out by the use of the following methods:

/a/ Compound expressions

Although descriptors are preferably self-contained, single
term concepts, the use of modifying expressions to make clear
the different meanings associated with a given term is
necessary in certain cases. For the method of entering the
resulting compound expression, see IX /a/ below.

Example: LATENT HEAT

/b/ Qualifiers for homonyms

The various forms of homonyms may be distinguished by the
use of qualifying expressions placed between parentheses
immediately after the homonym. Other homonyms should not be
used as parenthetic qualifiers.

Examples: BEAMS /ELECTROMAGNETIC/; BEAMS /STRUCTURAL/

/c/ Scope notes

A scope note is a brief explanation which may accompany the
descriptor in the thesaurus display, but does not form part of
the descriptor. It indicates the way in which the descriptor
should be used; it need not necessarily consist of a dictionary
definition. Scope notes are sometimes used to restrict the
usage of a descriptor. They should always be used in connexion
with abbreviations and acronyms /see VIII /d/ below/.

It is recommended that either /a/ and /c/ or /b/ and /c/
above be used together in a single thesaurus. Methods /a/ and
/b/ should be mutually exclusive and never be used in one and
the same thesaurus.
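The parenthetic qualifier described in VII /b/ could be handled mechanically along the following lines. This is a minimal sketch under stated assumptions: the function name is hypothetical, and the qualifier is assumed to be written between ordinary parentheses directly after the homonym (the printed examples in this text show strokes instead).

```python
import re

# A term followed by exactly one parenthetic qualifier, e.g. "MERCURY (PLANET)".
QUALIFIED = re.compile(r"^([^()]+?)\s*\(([^()]+)\)$")


def split_qualifier(descriptor):
    """Return (term, qualifier); the qualifier is None when there is none."""
    match = QUALIFIED.match(descriptor)
    if match:
        return match.group(1), match.group(2)
    return descriptor, None


print(split_qualifier("MERCURY (PLANET)"))
print(split_qualifier("LATENT HEAT"))
```

Separating the qualifier from the term in this way makes it possible to list all descriptors that share the same homonym and to check that each of them carries a qualifier.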

VIII. Form of descriptors

/a/ Word form

Once it has been decided to include a given term in the
thesaurus, care should be taken to ensure that the word form
used adequately conveys the exact meaning intended.

/i/ Spelling: the most widely accepted spelling of the word
should be used. Cases arise, particularly in English, due to
varying usage on different sides of the Atlantic, where more
than one spelling of a word is accepted, in which case both
forms of the word should be included in the thesaurus. In these
cases the preferential cross-reference should be employed /see
X /a/ below/. Alternatively, a well-established dictionary can
be chosen to act as arbitrator whenever this problem arises.

Examples of scope notes*:

DOCUMENTATION
The process of storing and retrieving information in all
fields of learning.

DOCUMENTATION
The volume of documents assembled or available.

DOCUMENTATION
The title of a family of publications.

Examples of variant spellings:

SULFUR
SULPHUR

* For instance, in three different thesauri. If these three
meanings were in the same thesaurus, they would require
qualifiers in order to make them unique.

/ii/ Translation: many current technical terms have arisen
by translation from other languages, but sometimes a modern
foreign language or Latin term is incorporated into the
specialized vocabulary for a particular subject. When both the
foreign language term and its putative translation coexist,
they should both be included in the thesaurus and
cross-referenced preferentially.

/iii/ Transliteration: the problem is further complicated
when the foreign language in question is written in a different
alphabet. This is particularly true in the case of identifiers
/see VI above/. The transliteration standards* recommended by
the International Organization for Standardization should be
used whenever applicable. Wherever a choice exists the
transliteration which does not employ diacritical marks should
be selected /see /e/ below/.

/b/ Noun form

The descriptor should be in the form of a noun or that part
of the verb which is grammatically equivalent.

/c/ Number

In general, the plural form should be used for descriptors,
particularly when generic terms are involved. The singular form
is used for specific material or property terms, process terms,
proper names and disciplinary areas. Sometimes the singular and
plural forms of a word denote different concepts; in this case
both should be entered.

Examples:

BRAKING RADIATION
BREMSSTRAHLUNG

SATELLITE
SPUTNIK

The gerund in English

FORCES
HEATING
PALYNOLOGY
TEAK
WOOD
WOODS

* See page

/d/ Abbreviations and acronyms

Abbreviated word forms should be used only when their
meaning is internationally established. Both abbreviated and
unabbreviated forms should be displayed and cross-referenced
preferentially. Sometimes the necessity for limiting the length
of the descriptor /see /e/ below/ entails the use of less well
established abbreviations. In all these cases a scope note /see
VII /c/ above/ should be appended.

The above remarks also apply to acronyms.

/e/ Character set

Since the majority of scientific and technical thesauri now
being established will probably be used in connexion with
electronic computers, it is advisable to use only the upper
case format for the descriptors. Diacritical marks should be
avoided for the same reason.

The need for these restrictions will probably disappear in
the near future, as the fruits of technical advances become
more widely distributed, and computer manufacturers pay more
heed to the exhortations of information scientists to lower the
costs of peripheral equipment.

As mentioned in /d/ above, the eventual use of a computer
may entail the limiting of the number of characters that a
descriptor may have.

/f/ Special characters and numerals

The only special characters allowed in descriptors are left
and right parentheses and unavoidable hyphens. /Full stops may
sometimes be used, see IX /b/ below/. Any other
non-alphanumeric symbols should be confined to scope notes,
always within the limits of machine character availability. If
the descriptors contain numeric elements, arabic numerals
should be used. The position of the numerals should follow
normal usage. Rules must be established for the treatment of
subscript and superscript numerals.




Examples:

UNESCO
United Nations Educational, Scientific and Cultural
Organisation

MERCURY /PLANET/

In the particular case of data retrieval thesauri, the
stroke /"/"/ may sometimes be found necessary.

Example: £B/lt
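The restrictions of VIII /e/ and /f/ lend themselves to a mechanical check. The sketch below is an illustration only: the function name, the wording of the messages and the optional max_length parameter are assumptions, and the exceptional full stop and stroke mentioned above are deliberately left out of the permitted set.

```python
import re

# Upper-case letters, arabic numerals, spaces, hyphens and parentheses only.
ALLOWED = re.compile(r"[A-Z0-9 ()\-]+")


def check_descriptor(term, max_length=None):
    """Return a list of rule violations (empty when the term conforms)."""
    problems = []
    if not ALLOWED.fullmatch(term):
        problems.append("character outside A-Z, 0-9, space, hyphen, parentheses")
    if term.count("(") != term.count(")"):
        problems.append("unbalanced parentheses")
    # max_length models a machine-imposed limit; None means no limit.
    if max_length is not None and len(term) > max_length:
        problems.append("longer than %d characters" % max_length)
    return problems


print(check_descriptor("MERCURY (PLANET)"))
print(check_descriptor("Bremsstrahlung"))
print(check_descriptor("LIGHT-SENSITIVE DEVICES", max_length=10))
```

A term written in lower case or with diacritical marks fails the first test, which corresponds to the recommendation to use upper case only and to avoid diacritics.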



IX. Methods of entering descriptors in the thesaurus

/a/ Syntax

Compound expressions consisting of two or more words should
be listed preferably by direct entry i.e. not artificially
inverted. This is especially true for /10/ descriptors; for
forbidden terms, this recommendation may be relaxed. Evidently
this does not apply when a permuted or key word in context type
of multiple entry is used. Inverted entries may be used
provided they are preferentially cross-referenced /see X /a//.
When a qualifier between brackets /see VII /b/ above/ forms
part of the descriptor it is advisable to enter the qualifier
on its own with a preferential cross-reference to the complete
descriptor.

Example: ELECTRICAL POWER, not POWER, ELECTRICAL



/b/ Punctuation

Punctuation marks should not appear in the descriptors. As
stated in VIII /f/ above, the only non-alphanumeric symbols
normally allowed in the descriptors are left and right
parentheses. Full stops should only be allowed when, due to a
limit on the length of the descriptor, a word has to be
truncated. Hyphens should only be used when their omission
would alter the intended meaning of the descriptor. Commas,
colons and apostrophes should be excluded since they are not
necessary to convey the meaning of the terms. Where punctuation
marks are omitted, it is advisable to include them in full in
the scope notes.



Examples: LIGHT-SENSITIVE DEVICES; HIGH-VOLTAGE PROBES

/c/ /i/ Specialized vocabularies

Certain fields have highly specific systems of nomenclature,
or well-established standardized technical vocabularies.
Whenever an internationally agreed nomenclature exists, it
should be used.

/ii/ Specific names

The proliferation of unrelated specific names would tend to
convert the thesaurus into a simple list of identifiers which
would be self-defeating. It is therefore recommended that the
names of unrelated specific entities be avoided as much as
possible.

/iii/ Specific items

Descriptors representing generic, functional or structural
concepts can be co-ordinated to denote specific items, while by
retaining the property of being cross-referenced, they fulfil
the structural needs of thesaurus elements.

/d/ Alphabetization

Where appropriate, one of the following alphabetization
methods may be followed:

/i/ letter by letter

/ii/ word by word

/iii/ computer sort

The selection of the method of alphabetization depends on
all the factors affecting the thesaurus under construction,
i.e. the size and structure of the domains covered by the
thesaurus, the availability of machine processing, the kind of
hardware available, etc. In all cases the alphabetization rules
should be clearly and explicitly drawn up before any kind of
ordering is attempted.
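The difference between the three alphabetization methods can be seen on a small list. The following is a sketch only: the function names and the sample terms are hypothetical, and "computer sort" is taken here to mean a plain comparison of character codes, which is one possible reading of the term.

```python
def letter_by_letter(terms):
    """File terms as if spaces and hyphens were not there."""
    return sorted(terms, key=lambda t: t.replace(" ", "").replace("-", ""))


def word_by_word(terms):
    """File terms word after word, so that a short first word comes first."""
    return sorted(terms, key=lambda t: t.replace("-", " ").split())


def computer_sort(terms):
    """File terms by raw character codes (space precedes every letter)."""
    return sorted(terms)


terms = ["NEWSPAPERS", "NEW YORK", "NEWTON", "NEW ZEALAND"]
print(letter_by_letter(terms))
print(word_by_word(terms))
print(computer_sort(terms))
```

Letter by letter, NEWSPAPERS and NEWTON precede NEW YORK; word by word, both entries beginning with the word NEW come first. This is why the rule chosen has to be stated before any ordering is attempted.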



/e/ Synonyms and quasi-synonyms

It is rare that two or more candidate
descriptors can be considered as true
synonyms. When one candidate descriptor
must be searched every time that another
is searched, they may be treated as synonyms.
Descriptors that overlap significantly
or represent different aspects of the
same property may be considered quasi-synonyms.
Antonyms should be similarly
treated. When all the synonyms, quasi-synonyms
or antonyms are included in the
thesaurus display, the preferential cross-reference
should be used /see X /a/ below/.


X. Interrelationships 
between descriptors 

The most important function of a thesaurus
is to serve as a tool for information
retrieval. Therefore it should bring
into evidence the interrelationships
between individual descriptors. These can
be expressed by several means. If codes
are used to indicate these relationships,
their meaning should always be made clear.

These interrelationships are of three
types: preferential; hierarchical; affinitive.
All three have the property of reciprocity,
i.e. when two or more descriptors
are related in any way, reciprocal
entries are required. /Identifiers /VI
above/ are the only exceptions to this very
important rule./

This is necessary for the homogeneity
of the thesaurus and for "book-keeping"
purposes.
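
The reciprocity rule lends itself to a mechanical check. The sketch
below is only an illustration: it assumes that entries are stored as
(term, code, target) triples and that the usual English codes USE/UF,
BT/NT and RT are employed. The function name and the data layout are
hypothetical and are not prescribed by the Guidelines.

```python
# Each code paired with the code of the reciprocal entry it requires.
RECIPROCAL = {"USE": "UF", "UF": "USE", "BT": "NT", "NT": "BT", "RT": "RT"}

def missing_reciprocals(entries):
    """Return the reciprocal entries that are required but absent.

    `entries` is a set of (term, code, target) triples.
    """
    missing = set()
    for term, code, target in entries:
        back = (target, RECIPROCAL[code], term)
        if back not in entries:
            missing.add(back)
    return missing

entries = {
    ("ALCOHOLS", "USE", "ALKANOLS"),
    ("ALKANOLS", "UF", "ALCOHOLS"),
    ("CALCULUS", "NT", "INTEGRAL CALCULUS"),  # reciprocal BT entry omitted
    ("EDUCATION", "RT", "LEARNING"),
    ("LEARNING", "RT", "EDUCATION"),
}

gaps = missing_reciprocals(entries)
print(gaps)
```

In this example the only gap reported is the BT entry from INTEGRAL
CALCULUS back to CALCULUS. Identifiers, being exempt from the rule,
would have to be left out of such a check.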



COLUMBIUM/NIOBIUM

HEREDITY/GENETICS

HARDNESS/SOFTNESS
/a/ Preferential

This reference is employed to refer from a forbidden
term to /11/ a descriptor and vice versa. It
is used when the meaning of descriptors overlaps
substantially; where different spellings of the same
word exist; for synonyms, quasi-synonyms and antonyms
and, in general, wherever a choice has been made
between a number of descriptors, all of which are
included in the thesaurus display.

/b/ Hierarchical

Hierarchical relationships are used to exhibit
relative degrees of specificity within a category of
descriptors all of which belong to a particular generic
group. This relationship is not based upon the
possible use or application of an entity, but on the
position of the descriptor within a given class of
concepts. Note that certain terms may be members of
more than one hierarchical chain. Where any hierarchy
has more than two levels the cross-references for all
levels should be completed for each descriptor. The
kinds of hierarchical relationships which it is desirable
to indicate depend on the structure of the
subject field of the thesaurus. In general, all concepts
which are sub-divisions of a broader concept
should form part of a hierarchical chain.
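
Completing the cross-references for all levels amounts to following
every chain upwards from a descriptor. A minimal sketch follows; it
assumes the hierarchy is stored as a mapping from each descriptor to
its immediately broader terms. MATHEMATICS and the BIOCHEMISTRY example
are hypothetical additions used only to show a three-level chain and a
descriptor belonging to two chains.

```python
def all_broader(term, broader):
    """Collect every broader term reachable from `term`, at any level."""
    found = set()
    stack = [term]
    while stack:
        for parent in broader.get(stack.pop(), ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found

# descriptor -> immediately broader descriptors (illustrative data)
broader = {
    "INTEGRAL CALCULUS": ["CALCULUS"],
    "CALCULUS": ["MATHEMATICS"],
    "BIOCHEMISTRY": ["BIOLOGY", "CHEMISTRY"],  # member of two chains
}

print(all_broader("INTEGRAL CALCULUS", broader))
print(all_broader("BIOCHEMISTRY", broader))
```

The same traversal run over the inverse mapping would give all narrower
terms, so that the entry for each descriptor can list every level.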

/c/ Affinitive

The affinitive relationship is employed to refer
from a descriptor to others that are closely related
in concept but are neither consistently hierarchically
nor preferentially related. This relationship may be
based on usage, application, physical proximity, etc.



Common codes in English are:
use /USE/, includes, used for /UF/

ALCOHOLS USE ALKANOLS
ALKANOLS UF ALCOHOLS

Common codes in English are:
broader term /BT/, narrower term /NT/, specific to, generic to

CALCULUS NT INTEGRAL CALCULUS
INTEGRAL CALCULUS BT CALCULUS

Genus-species in zoology
Whole-part
Subordinate concepts

Common codes in English are:
related term /RT/, also see

EDUCATION RT LEARNING
LEARNING RT EDUCATION


XI. Presentation of
thesaurus

It is recommended that a thesaurus be presented
in one or more systematical displays and alphabetical
listings /12/.
/a/ Systematical listing

Systematical listing /13/ refers to that form of thesaurus
display in which descriptors are first of all grouped in general
class categories within each of which the interrelationships
between the descriptors, particularly the hierarchical
relationships, are as self-contained as possible. Full use should
be made of recorded experience in the field of classification
when establishing the membership of the various facets.

Some descriptors may appear in more than one category but
this should occur only when either the descriptor is accompanied
by a parenthetical qualifier or when cross-references are
used.

Thesauri that are presented in this way should always contain
an alphabetical listing of all the terms included in the
thesaurus /see III above/.

Systematical listing /14/ is probably better for very specialized
scientific and technical fields than for interdisciplinary
areas.

Combination of this type of display with /c/ below gives
rise to a kind of structured alphabetical list which probably
combines to the fullest extent the advantages of both.

/b/ Graphic display

Perhaps the most subtle mode of presentation of thesauri
is to display the descriptors and the relationships between
them graphically. Although this can be done multi-dimensionally,
for instance by taking two dimensions for each facet of a
multi-faceted thesaurus, the more current methods are
two-dimensional.

One such system consists of arranging the descriptors in
semantic groups, assigning a gridded sheet to each group and
giving fixed positions to each descriptor with respect to the
horizontal and vertical axes, thus defining co-ordinates.

Interrelationships between descriptors are then shown by 
means of arrows. Associative relationships are denoted by bi- 
directional arrows. Hierarchical relationships are shown by 
unidirectional arrows always pointed to the more specific des- 
criptor. Preferential relationships may be Indicated by brac- 
kets with the arrows leaving or arriving at the preferred term. 
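
The arrow conventions just described can be written down as a small
rendering rule. The sketch below is an assumption-laden illustration:
it draws the arrows as plain text, uses the English codes NT, BT, RT
and USE to identify the relationship, and places descriptors at grid
co-ordinates that are entirely hypothetical.

```python
def arrow(term, code, target):
    """Render one relationship as a text arrow."""
    if code == "NT":   # hierarchical: points to the more specific descriptor
        return f"{term} --> {target}"
    if code == "BT":   # same relationship seen from the specific descriptor
        return f"{target} --> {term}"
    if code == "RT":   # associative: bi-directional arrow
        first, second = sorted((term, target))
        return f"{first} <-> {second}"
    if code == "USE":  # preferential: bracketed term, arrow arrives at preferred term
        return f"[{term}] --> {target}"
    raise ValueError(f"unknown code: {code}")

# Hypothetical positions on the gridded sheet of one semantic group.
grid = {"CALCULUS": (2, 5), "INTEGRAL CALCULUS": (2, 3)}

print(arrow("INTEGRAL CALCULUS", "BT", "CALCULUS"))
print(arrow("LEARNING", "RT", "EDUCATION"))
print(arrow("ALCOHOLS", "USE", "ALKANOLS"))
```

Because both members of a reciprocal pair render to the same arrow, a
display built this way draws each relationship once.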
It is understood that a descriptor may belong to several
groups. The optimal size of each group appears to lie between
30 and 40. As before, an alphabetical listing should be given
in annex showing the semantic group/s/ to which each descriptor
belongs. Which mode of presentation is selected will depend
on the use to which the particular thesaurus will be put.

The latter two types of display lend themselves more easily
to translation. A rather particular type of thesaurus is the
following.

/c/ Alphabetical listing

The great advantage of an alphabetical listing is that
the introduction and correct positioning of new descriptors is
very easy. On the other hand, it is extremely difficult to
introduce structure into a strictly alphabetical list. For
instance, synonyms come more readily to mind if we think of a
particular category as a whole rather than individual descriptors
plucked at random from an alphabetical list. It should be
remembered that a particular alphabetical order is only applicable
in one language. Permuted alphabetical lists may also be
used /15/.
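
The text does not define a permuted list. Under one common reading, each
multi-word descriptor is filed under every one of its words, so that a
descriptor can be found from any word it contains. The following sketch
rests on that assumption; the function name is hypothetical and
DIFFERENTIAL CALCULUS is an invented sample term.

```python
def permuted_list(descriptors):
    """File every descriptor under each of its words, alphabetically."""
    lines = []
    for descriptor in descriptors:
        for word in descriptor.split():
            lines.append((word, descriptor))
    return sorted(lines)

index = permuted_list(["INTEGRAL CALCULUS", "DIFFERENTIAL CALCULUS", "CALCULUS"])
for word, descriptor in index:
    print(f"{word:<14}{descriptor}")
```

All three descriptors appear together under CALCULUS, which restores
some of the grouping that a strictly alphabetical list loses.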



Development

XII. Periodic verification of usefulness
of individual descriptors

At least for the first few years, if not permanently, after
the establishment of a thesaurus, a check should be kept on the
frequency with which particular descriptors are utilised, both
for indexing and retrieval purposes. Periodic verification should
ensure that certain descriptors neither interfere with, nor
duplicate one another. On all occasions in which a search does
not locate the desired information or the amount of information
suspected of being in the collection, a critical appraisal of
the descriptors which were, or should have been used, ought to
be carried out.
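
Keeping such a check is largely a matter of counting. The sketch below
assumes two separate tallies, one for indexing and one for retrieval;
the function names and the sample data are illustrative and not part of
the Guidelines.

```python
from collections import Counter

indexing_use = Counter()   # descriptor -> times assigned to a document
retrieval_use = Counter()  # descriptor -> times used in a search

def record_indexing(descriptors):
    """Count the descriptors assigned to one document."""
    indexing_use.update(descriptors)

def record_search(descriptors):
    """Count the descriptors used in one search."""
    retrieval_use.update(descriptors)

def usage_report(vocabulary):
    """Return {descriptor: (indexing count, retrieval count)}."""
    return {d: (indexing_use[d], retrieval_use[d]) for d in sorted(vocabulary)}

record_indexing(["EDUCATION", "LEARNING"])
record_indexing(["EDUCATION"])
record_search(["LEARNING"])

report = usage_report({"EDUCATION", "LEARNING", "CALCULUS"})
print(report)
```

Reporting over the whole vocabulary, rather than only over the terms
that were counted, is what makes never-used descriptors visible.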
XIII. Elimination of descriptors

If it is found that any descriptor is being used very
infrequently, care should be taken to ensure that the infrequency
of usage is not due purely to the lack of documents related
to that particular concept. It may either be eliminated from
the thesaurus or replaced by another more common term. Complete
elimination should occur ideally only when that particular
descriptor has never been used, either for indexing or
retrieval purposes. The use of a preferential relationship to
indicate where the replacement has been effected is more
practical.

The inverse is also true: if too many indexed materials
are assigned to the same descriptor, its specificity is lost,
its application has become too general and the breaking-down
of the concept should be considered.

If a preferential relationship is not used, the date of
introduction of a new descriptor into the thesaurus should be
noted since, prior to that date, indexers were not authorised
to use that term.

The procedure to be followed when a particular descriptor
is over or under used depends to a certain extent on the search
strategy employed in retrieval. If the least specific descriptor
is searched for last, it may not be worthwhile to eliminate
it.
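
The considerations above can be summarised as a rough decision rule.
The sketch below is only a first screening aid: the thresholds `rare`
and `crowded` are arbitrary assumptions, the function name is
hypothetical, and the result is a suggestion for human review, since
infrequent use may simply reflect a lack of documents on the concept.

```python
def review(indexing_count, retrieval_count, rare=3, crowded=500):
    """Suggest an action for one descriptor; thresholds are illustrative."""
    total = indexing_count + retrieval_count
    if total == 0:
        # never used for indexing or retrieval
        return "eliminate"
    if total < rare:
        # used very infrequently: replace, keeping a preferential reference
        return "replace and add preferential reference"
    if indexing_count > crowded:
        # too many documents under one descriptor: specificity is lost
        return "break down"
    return "keep"

print(review(0, 0))
print(review(1, 1))
print(review(800, 40))
print(review(50, 20))
```

A fuller version would also take the search strategy into account, for
example by exempting the least specific descriptors when they are
searched last.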

XIV. Choice of new descriptors

Indexers and users should constantly be on the look-out for
new candidate descriptors which may represent either new concepts
or different facets of old concepts. If possible, the
descriptor should be used on a trial basis by indexers for some
time before becoming a definite addition to the thesaurus.

The frequency of occurrence of such candidate descriptors
both as indexing and retrieval terms is a good indication of
their future usefulness. If it is decided to add a new descriptor,
the interrelationships with all the pre-existing descriptors
should be identified and introduced in the appropriate
places.



Definite additions should not be introduced singly as this
causes confusion among the users of the thesaurus. New descriptors
should be saved up and introduced by batches, either as
"additions to the thesaurus" or on the occasion of a new edition
of the thesaurus. This does not preclude their use by indexers.
There should exist a central authority which examines
all the suggestions received and issues a final verdict on the
acceptability or otherwise of the possible new additions.

It should always be remembered that a thesaurus is never
completed, its size and shape being a function of time.



Note: Numbers in brackets from /1/ to /15/ signify places
in which the final text "Guidelines for the Establishment
and Development..." differs slightly from the text of Project 3.
List of ISO recommendations
related to these Guidelines

ISO/R 9      "International system for the transliteration of
             Slavic Cyrillic characters" 2nd edition.

ISO/R 233    "International system for the transliteration of
             Arabic characters".

ISO/R 259    "Transliteration of Hebrew".

ISO/R 704    "Naming principles".

ISO/R 843    "International system for the transliteration of
             Greek characters into Latin characters".

ISO/R 860    "International unification of concepts and terms".

ISO/R 919    "Guide for the preparation of classified vocabularies".

ISO/R 1087   "Vocabulary of terminology".

ISO/R 1149   "Layout of multilingual classified vocabularies".

ISO/DR 1951  "Lexicographical symbols, particularly for use in
/Draft/      classified defining vocabularies".

The above documents are available either from the Headquarters
of ISO /International Organisation for Standardization/,
1 rue de Varembé, Geneva 20, Switzerland,
or from the corresponding National Standards Organizations
of the member countries of ISO.



Sources for dictionaries and glossaries:

Bibliography of interlingual scientific and technical
dictionaries. 5 ed. Paris, Unesco, 1969. 250 p.

Bibliography of monolingual scientific and technical
glossaries. Vol. I: National Standards. 1955, 219 p. Vol. II:
Miscellaneous Sources. 1959, 146 p. Paris, Unesco.

/Supplements published in Babel. International Journal of
Translation published by the International Federation of
Translators with the assistance of Unesco. Avignon, France./

Bibliographic Bulletin of the Clearinghouse at CIINTE. idSq.
Warsaw, CIINTE, 1969, 190 p. /Annual supplements are planned./

Bibliographic Systems Center Subject Index. Case Western
Reserve University, Cleveland, Ohio, U.S.A., 1969. /Computer
print-out./

Some national standards institutions publish extensive
unilingual and sometimes bilingual technical vocabularies.
QUESTIONNAIRE ON PROBLEMS THOUGHT DESIRABLE TO RAISE
BEFORE THE MEETING
1. Definition of a thesaurus

a/ what does "thesaurus" mean?

b/ which structural elements /semantic, syntactic etc./
should be included in order to be able to call a given
construction "a thesaurus"?

c/ which elements and factors influence the organisation
of a thesaurus?

d/ how should the degree of complexity and the number of
information items contained in a thesaurus be evaluated?



2. Is the concept of a thesaurus sufficiently complete and
univocally and exhaustively defined, or does it necessitate
further analysis?



3. What is the role of a thesaurus?
a/ direct use in information and retrieval systems
b/ in development of scientific information



4. How can thesauri be constructed? 
a/ methods of compiling thesauri 

b/ possibilities and advantages of automation in compilation
of thesauri



5. How can thesauri be classified?
According to:
a/ branches /subject/

b/ accuracy of definition of concept relations
c/ the degree of hierarchy
d/ adopted types of basic components
e/ coding methods?

6. How can we define a descriptor?
/Conditions which a key-word must fulfil in order to become
a descriptor - the methods of constructing descriptors/.

7. What conditions must descriptors and thesauri fulfil in
order to ensure their inter-branch and inter-language
correlation?

8. What conditions must be fulfilled by descriptors and
thesauri as tools for the further development of information?

We consider the above topics only as general outlines, and
would be grateful if you widen the scope of the subject matter.
If in your opinion it is not necessary to discuss all the
items, please give your opinion only on such points which you
consider to be of utmost importance.